ASYMPTOTIC INFERENCE FOR HIGH-DIMENSIONAL DATA

By Jim Kuelbs and Anand N. Vidyashankar

University of Wisconsin and Cornell University 

In this paper, we study inference for high-dimensional data char- 
acterized by small sample sizes relative to the dimension of the data. 
In particular, we provide an infinite-dimensional framework to study 
statistical models that involve situations in which (i) the number of
parameters increases with the sample size (and is allowed to be random)
and (ii) there is a possibility of missing data. Under a variety
of tail conditions on the components of the data, we provide precise 
conditions for the joint consistency of the estimators of the mean. 
In the process, we clarify and improve some of the recent consis- 
tency results that appeared in the literature. An important aspect of 
the work presented is the development of asymptotic normality re- 
sults for these models. As a consequence, we construct different test 
statistics for one-sample and two-sample problems concerning the 
mean vector and obtain their asymptotic distributions as a corollary 
of the infinite-dimensional results. Finally, we use these theoretical 
results to develop an asymptotically justifiable methodology for data 
analyses. Simulation results presented here describe situations where 
the methodology can be successfully applied. They also evaluate its 
robustness under a variety of conditions, some of which are substan- 
tially different from the technical conditions. Comparisons to other 
methods used in the literature are provided. An analysis of real-life data
is also included.

1. Introduction. Modern scientific technology is providing a class of sta- 
tistical problems that typically involve data that are high dimensional, and 
frequently lead to questions involving simultaneous inference for large sets 
of parameters. The number of parameters in these datasets is often random,
and grows rapidly in comparison to the sample size; furthermore, there can
be missing observations. Microarrays epitomize this situation, but similar 
problems arise in other areas such as polymerase chain reactions, proteomics, 
functional magnetic resonance imaging, and astronomy. For example, in
microarray experiments the number of expressed genes differs between
replicates, and certain genes do not express in all replications, leading to missing
data. Statistical analysis of such problems is an area of increasing concern,
and various statistical models and methods have been developed to analyze 
these situations. Some recent references in this area include [7] and [18], 
which study the large p small n problem. The reference [15] studies the joint 
asymptotics in the context of general regression problems when the number
of parameters diverges to infinity with the sample size. In particular, [7]
investigates the simultaneous estimation of the marginal distributions in the
large p small n problem, and it describes how these results can then be used 
to control the so-called false discovery rates (FDR). 

The primary focus of this paper is to develop a general framework for 
joint statistical analysis of parameters in high-dimensional problems. Fur- 
thermore, we allow a random number of parameters and missing data in our 
data structures. This is achieved using infinite-dimensional techniques. Al- 
though the methods of our paper apply generally to many high-dimensional 
data problems, we will frequently use the terminology from microarrays to 
facilitate connections to one of the contemporary scientific disciplines. Now 
we turn to some specifics of our model. 

For each fixed integer $n \ge 1$, we begin with a collection of independent
sequences of real valued random variables $\{\xi_{n,i,j} : j \ge 1\}$. All are assumed
to be defined on a common probability space, and there is no dependence 
relationship assumed as n and j vary. In the context of microarrays, for n 
fixed, each of these sequences represents the expression levels of genes in one 
replication of the experiment. The index n can be interpreted as either the 
time frame or as a label for the laboratory where the experiment is being 
performed. In particular, the random variable $\xi_{n,i,j}$ can then be thought of
as the expression level of the $j$th gene in the $i$th replicate with index $n$. The
number of replicates, for fixed index n, could be any integer r(n), but for 
the sake of simplicity we take r(n) = n. Nevertheless, the techniques of this 
paper can be applied to develop results for other choices of r(n). 

Since the expressed genes between replicates may not coincide, either due 
to the random number that appear or for other reasons (which can be viewed 
as random deletions), we incorporate these two nonmutually exclusive
possibilities into our model. We let $N_{n,i}$ denote the random number of variables
within the $i$th replicate having index $n$. We also assume for each integer
$n \ge 1$ that $\{N_{n,i} : i \ge 1\}$ is an i.i.d. sequence of integer valued random
variables with $P(N_{n,i} \ge 1) = 1$. Of course, in real datasets for fixed $n$, the row
lengths are bounded, but our results also apply to situations where they are
unbounded. 

To model missing data, we postulate that the missing mechanism is 
independent of the expression level and the random number of parameters
involved. For this reason, we introduce the Bernoulli random variables
$\{R_{n,i,j} : n \ge 1, i \ge 1, j \ge 1\}$ to represent missing data indicators, where

(1.1) $P(R_{n,i,j} = 1) = p$ for $n \ge 1$, $i \ge 1$, $j \ge 1$.

We will assume that $0 < p \le 1$, and also that the sequences
$\{\xi_{n,i,j} : n \ge 1, i \ge 1, j \ge 1\}$, $\{R_{n,i,j} : n \ge 1, i \ge 1, j \ge 1\}$, and
$\{N_{n,i} : n \ge 1, i \ge 1\}$ are independent.
The case $p = 1$ corresponds to the case that there is no missing data.

In traditional multivariate analysis, such data is typically represented as 
random vectors in a fixed dimension d. However, since we are studying the 
model in which the dimension of the parameter vector diverges to infinity 
with the sample size, we represent it as a vector in $R^\infty$, the linear space of
all real sequences. That is, we set

(1.2) $X_{n,i} = \sum_{j \ge 1} \xi_{n,i,j}\theta_{n,i,j} e_j$, $i = 1, \ldots, n$,

where

(1.3) $\theta_{n,i,j} = I(j \le N_{n,i}) R_{n,i,j}$,

for $n \ge 1$, $i \ge 1$, $j \ge 1$, and $\{e_j : j \ge 1\}$ is the canonical basis for $R^\infty$; that is,
$e_j = \{\delta_{j,k} : k \ge 1\}$ for $j = 1, 2, \ldots$, where $\delta_{j,k} = 1$ for $j = k$ and $0$ if $j \ne k$. In
the context of microarrays, the coordinates of the vector $X_{n,i}$ are thought to
be the "normalized expression levels" of genes identified in the $i$th replicate
with index $n$. In probabilistic terms, the collection $X_{n,1}, X_{n,2}, \ldots, X_{n,n}$ forms
a triangular array of $n$ independent $R^\infty$-valued random vectors. Let
$N_n^* = \max_{1 \le i \le n} N_{n,i}$ denote the maximum number of components (columns) in the
dataset; or in the context of microarrays, the total number of expressed genes
present. If $P(N_n^* < \infty) = 1$, the components of $X_{n,i}$, namely $\xi_{n,i,j}\theta_{n,i,j}$, equal
$0$ for $j > N_{n,i}$. In other words, $X_{n,i} \in c_0$, where $c_0$ is the linear space of all
real sequences converging to 0. Hence, we will be concerned with asymptotic
inference for data in $c_0$. Throughout the paper, we allow the possibility that
$P(N_n^* = \infty) > 0$. We also will use the notation $x = \sum_{j \ge 1} x_j e_j$ to denote a
typical vector in $R^\infty$, where $\{e_j : j \ge 1\}$ denotes the canonical basis vectors
defined above.
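The data structure in (1.1) to (1.3) can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_replicates`, the uniform law for the row lengths, the standard normal law for the expression levels, and the truncation to `max_len` columns are all assumptions made only for illustration.

```python
import random

def simulate_replicates(n, p_obs, max_len, seed=0):
    """Simulate n replicates X_{n,i}, each stored as a list of length max_len.

    Hypothetical choices (not from the paper): the row lengths N_{n,i} are
    uniform on {1, ..., max_len} and the xi_{n,i,j} are standard normal.
    """
    rng = random.Random(seed)
    rows, thetas, lengths = [], [], []
    for _ in range(n):
        N_i = rng.randint(1, max_len)                  # random row length, N_i >= 1
        theta = []
        for j in range(1, max_len + 1):
            R_ij = 1 if rng.random() < p_obs else 0    # missing-data indicator, as in (1.1)
            theta.append(R_ij if j <= N_i else 0)      # theta_{n,i,j}, as in (1.3)
        xi = [rng.gauss(0.0, 1.0) for _ in range(max_len)]
        rows.append([x * t for x, t in zip(xi, theta)])  # coordinates of X_{n,i}, as in (1.2)
        thetas.append(theta)
        lengths.append(N_i)
    return rows, thetas, lengths

X, theta, N = simulate_replicates(n=10, p_obs=0.9, max_len=50)
```

Each simulated row is zero beyond its own length `N[i]` and wherever the Bernoulli indicator is zero, which is the sense in which the vectors lie in $c_0$ when the row lengths are finite.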

The space $c_0$, with the usual sup-norm given by

(1.4) $\|x\|_\infty = \sup_{j \ge 1} |x_j|$,

is naturally appropriate when studying the asymptotic inference for a
one-sample problem using the maximum of suitable "averages" of gene
expressions. In our data analyses, we also use $\ell_p$ subspaces, $2 \le p < \infty$, determined
by the norm

(1.5) $\|x\|_p = \bigl(\sum_{j \ge 1} |x_j|^p\bigr)^{1/p}$

when $2 \le p < \infty$, and by (1.4) when $p = \infty$. Related theoretical results for
these norms are studied in [8]. Our main asymptotic results concern the 
statistics 



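(1.6) $S_{n,n} = \sum_{i=1}^{n} X_{n,i}$

and

(1.7) $\mathcal{S}_{n,n} = \sum_{i=1}^{n} \sum_{j \ge 1} \frac{\xi_{n,i,j}\theta_{n,i,j}}{V_{n,j}^{1/2}} e_j = \sum_{i=1}^{n} \sum_{j=1}^{N_n^*} \frac{\xi_{n,i,j}\theta_{n,i,j}}{V_{n,j}^{1/2}} e_j$,

where

(1.8) $V_{n,j} = \max\{1, \sum_{i=1}^{n} \theta_{n,i,j}\}$, $n \ge 1$, $j \ge 1$.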

Here, the coordinate-wise random normalizers $V_{n,j}$ take into account the
differences amongst columns due to missing data and random row lengths,
and if we replace the $V_{n,j}^{1/2}$ in $\mathcal{S}_{n,n}$ by $n^{1/2}$, then we obtain $S_{n,n}/n^{1/2}$. Our
results include consistency, rates of convergence, and asymptotic normality
for these sums. The statistic $\mathcal{S}_{n,n}$ is important when we consider asymptotic
normality in our model, as it essentially normalizes each column by
the square root of the number of terms in that column. We also study the
statistic

(1.9) $T_{n,n} = \sum_{i=1}^{n} \sum_{j \ge 1} \frac{\xi_{n,i,j}\theta_{n,i,j}}{V_{n,j}} e_j = \sum_{i=1}^{n} \sum_{j=1}^{N_n^*} \frac{\xi_{n,i,j}\theta_{n,i,j}}{V_{n,j}} e_j$.
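The column-wise normalizations can be made concrete with a short computation. This is a sketch under stated assumptions: it takes the statistic normalized by the square root of the column count, as in (1.7), and assumes that the statistic in (1.9) divides each column sum by the column count itself. The names `column_statistics` and `sup_norm` are hypothetical, and only finitely many columns are kept.

```python
import math

def column_statistics(xi, theta):
    """Return (V, S_script, T) for n x m lists xi and theta.

    V[j]        : max(1, number of observed entries in column j), as in (1.8)
    S_script[j] : column sum divided by sqrt(V[j]), the j-th coordinate of (1.7)
    T[j]        : column sum divided by V[j] (assumed form of (1.9))
    """
    n, m = len(xi), len(xi[0])
    V, S_script, T = [], [], []
    for j in range(m):
        count = sum(theta[i][j] for i in range(n))
        v = max(1, count)
        col_sum = sum(xi[i][j] * theta[i][j] for i in range(n))
        V.append(v)
        S_script.append(col_sum / math.sqrt(v))
        T.append(col_sum / v)
    return V, S_script, T

def sup_norm(x):
    """Sup-norm of a finite vector, as in (1.4)."""
    return max(abs(v) for v in x)

# Toy data: 4 replicates, 3 columns; the third column is never observed.
xi = [[1.0, 2.0, 5.0], [3.0, -2.0, 7.0], [-1.0, 4.0, 9.0], [1.0, 0.0, 2.0]]
theta = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0]]
V, S_script, T = column_statistics(xi, theta)
```

In this toy example the first column has four observed entries summing to 4, so its coordinate is 4/2 = 2 under the square-root normalization and 4/4 = 1 under the count normalization; the unobserved third column contributes 0 because its normalizer defaults to 1.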

When $N_{n,i} = p_n$, where $p_n$ is nonrandom, exponential in $n$, and $p = 1$, this
is sometimes called the large p small n problem, and [7] and [18] studied the
behavior of $S_{n,n}$ in the sup-norm under various assumptions on the tail
behavior of $\xi_{n,i,j}$. For example, the results proved in [18] assume that the
random variables have bounded support, while [7] replaces this condition
by various exponential decay conditions on the tail behavior of $\xi_{n,i,j}$. The
primary technique employed in [7] to obtain consistency results uses the
uniform constants for the exponential rate of sup-norm convergence of the
empirical distribution function to the true distribution function. This is then
used to obtain results for the relevant partial sums of random variables using 
integration-by-parts techniques. While this approach yields useful results, 
the integration by parts required seems to obscure the true nature of the 
matter. From what we do here, we will see that it is more fruitful to study the 
problem from the point of view of the random variables themselves in that 
we are able to clarify some of the results described in [7], and also extend 
them under a broader range of conditions to our more general model. While 
we focus on the mean functional, [7] studies other interesting functionals of 
the data. 

The rest of the paper is organized as follows. Section 2 presents the main 
results. These results concern joint consistency and joint asymptotic nor- 
mality. Section 3 contains applications to hypothesis tests and Section 4 is 
devoted to simulation results and real data analysis. Section 5 contains the 
necessary probability estimates, while Sections 6 and 7 contain the proofs 
of our main results. 

Throughout the paper, $Lx = L(x) = \log_e(\max(x, e))$.

2. The main results. The results that we obtain will depend critically 
on the tail probabilities of the random variables $\{\xi_{n,i,j}\}$. These assumptions
are of two types, namely that the tail probabilities decay at an exponential 
rate, or that they decay polynomially. In the large p small n problem, our 
results imply that these tail probability conditions are closely tied to the 
way p must relate to n. For example, in the classic version of this problem 
where p grows exponentially fast in n, we need tail probabilities that decay 
exponentially fast, whereas if p grows only as a power of n, then we only 
need polynomial decay for the tails. The precise nature of this interplay 
for consistency results is contained in Theorems 2.2 and 2.3. In particular, 
the remarks following these theorems contain precise information on their 
relationship to the large p small n problem. 

First, we discuss the exponential decay case. Here, we assume that for
some $r$, $0 < r \le 2$, and all $x \ge 0$ there are constants $c_{n,j}$ and $k_{n,j}$ such that

(2.1) $P(|\xi_{n,i,j}| \ge x) \le c_{n,j} \exp\{-k_{n,j} x^r\}$

for all $n \ge 1$, $i \ge 1$, $j \ge 1$. Random variables satisfying (2.1) with $r = 2$ are usually
said to be sub-Gaussian, and if for $1 \le i \le n$ we have that each $\xi_{n,i,j}$ takes
values in the interval $[a_{n,j}, b_{n,j}]$, then we will see below that (2.1) holds with

(2.2) $r = 2$, $c_{n,j} = 2$ and $k_{n,j} = (2(b_{n,j} - a_{n,j})^2)^{-1}$, $n \ge 1$, $j \ge 1$.

Throughout, when $\xi_{n,i,j} = 0$ with probability one, in (2.1) we take

(2.3) $c_{n,j} = 1$ and $k_{n,j} = \infty$, $n \ge 1$, $j \ge 1$.

In addition, note that $c_{n,j} \ge 1$ is necessary by setting $x = 0$ in (2.1).

It is also useful to notice that if the condition (2.1) holds for some r* > 1, 
then it holds for all $1 \le r \le r^*$ by simply adjusting the constants $c_{n,j}$ and
keeping the same $k_{n,j}$. In particular, if (2.1) holds for some $r > 2$, then it
holds for r = 2, and we are in the sub-Gaussian setting. In [7], this seems 
to have gone unnoticed, and there one finds results for r > 2 which are 
weaker than the corresponding r = 2 results. However, this should not be 
the case as the previous comment implies the r = 2 result applies directly 
to what is proved there. Of course, in some settings there could be results 
that distinguish between various r values, even for r > 2, but that does not 
happen here, and is why we restrict $r$ to be in $(0, 2]$. Our methods also yield
results when $0 < r < 1$, whereas in [7], the parameter $r$ is always greater
than or equal to one.
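The adjustment of constants mentioned above can be written out explicitly. The derivation below is a sketch that assumes the exponential tail condition has the form $P(|\xi_{n,i,j}| \ge x) \le c_{n,j}\exp\{-k_{n,j}x^{r}\}$ with $k_{n,j} < \infty$; the adjusted constant $c'_{n,j}$ is notation introduced here only for illustration.

```latex
% Assume the tail bound holds with exponent r^* and constants c_{n,j}, k_{n,j} < \infty,
% and let 0 < r \le r^*.
% Case x \ge 1: here x^{r^*} \ge x^{r}, so
P(|\xi_{n,i,j}| \ge x) \le c_{n,j}\, e^{-k_{n,j} x^{r^*}} \le c_{n,j}\, e^{-k_{n,j} x^{r}} .
% Case 0 \le x < 1: here x^{r} < 1, so e^{k_{n,j}} e^{-k_{n,j} x^{r}} \ge 1 and
P(|\xi_{n,i,j}| \ge x) \le 1 \le e^{k_{n,j}}\, e^{-k_{n,j} x^{r}} .
% Combining both cases, the bound holds with exponent r, the same k_{n,j}, and
c'_{n,j} = \max\{ c_{n,j},\; e^{k_{n,j}} \} .
```

In particular, taking $r = 2$ shows why any result available in the sub-Gaussian case applies whenever the tail condition holds with some exponent larger than 2.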

Another situation we will discuss is when the assumption of exponential 
decay of the tails of $\xi_{n,i,j}$ in (2.1) is replaced by the polynomial decay

(2.4) $P(|\xi_{n,i,j}| \ge x) \le \frac{c_{n,j}}{(1 + x)^{k_{n,j}}}$, $x \ge 0$,

where $c_{n,j} \ge 1$ and typically for our results, $2 < k_{n,j} < \infty$.

We will assume throughout the paper that $E(\xi_{n,i,j}) = 0$ for all $n, i, j \ge 1$.
Should this not be the case, one would simply replace the tail probability
conditions in (2.1) and (2.4) by analogous conditions for the variables
$\{\xi_{n,i,j} - E(\xi_{n,i,j})\}$, and formulate the results in terms of these variables.

2.1. Consistency and rates of convergence. In this subsection we present 
several consistency and rate of convergence results for $\mathcal{S}_{n,n}$ and $S_{n,n}$.

Theorem 2.1. Let $\{X_{n,i} : 1 \le i \le n\}$ be as in (1.2), assume (2.1) holds
with $r = 2$, and take $\{a_n : n \ge 1\}$ to be a sequence of positive numbers. Furthermore,
assume $c_{n,j}$, $k_{n,j}$ are constants such that $c_{n,j} \ge 1$, $k_{n,j} \le \infty$ and

(2.5) $\sum_{n \ge 1} \sum_{j \ge 1} \exp\{-(\varepsilon a_n)^2 k_{n,j}/(16 c_{n,j})\} < \infty$

for all $\varepsilon > \varepsilon_0$. Then

(2.6) $\sum_{n \ge 1} P(\|\mathcal{S}_{n,n}\|_\infty > \varepsilon a_n) < \infty$

for all $\varepsilon > \varepsilon_0$, where $\mathcal{S}_{n,n}$ is given as in (1.7). Thus, if the constants $c_{n,j}$ and
$k_{n,j}$ are such that uniformly in $n \ge 1$ and for some $\delta > 0$,

(2.7) $k_{n,j}/(16 c_{n,j}) \ge \delta L(j + 3)$,

then for all $\varepsilon > 0$ such that $\varepsilon^2 \delta > 1$ we have



In particular, if (2.7) holds, then with probability one 




and there exists an $\alpha > 0$ such that




(2.10) ^(e"^^') <cx). 

Moreover, if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $\mathcal{S}_{n,n}$, then again (2.6), (2.8)
and (2.10) continue to hold.



Remark 2.1. In Theorem 2.4, we will establish a central limit theorem
for $\mathcal{S}_{n,n}$, and that $T_{n,n}$ converges to zero in probability under related
conditions.

In Theorem 2.1, the impact of the random row sizes $\{N_{n,i} : i \ge 1\}$ is hidden
due to our choice of normalizations $\{a_n\}$ as given in (2.5). For example, (2.5)
implies the ratio $k_{n,j}/c_{n,j}$ cannot be bounded as $j$ goes to infinity, but in
our next result we only require this ratio to be uniformly bounded below
in both $n$ and $j$ by a strictly positive constant. Under this different set of
conditions, the role of $\{N_{n,i} : i \ge 1\}$ appears in the normalizations for $\mathcal{S}_{n,n}$
given by $h(n)$ in (2.12). In particular, if $N_{n,i} = p_n$ for $\{i \ge 1, n \ge 1\}$, then
Theorem 2.2 and Remark 2.4 below yield the results in Corollaries 1 and 2
in [7] when $r = 2$. The consistency results in [7] for $1 \le r < 2$, as well as for
many other cases, follow immediately from Theorem 2.3 below.

Theorem 2.2. Let $\{X_{n,i} : 1 \le i \le n\}$ be as in (1.2), and assume (2.1)
holds with $r = 2$, and that for $1 \le c < \infty$, $0 < k < \infty$, we have $c_{n,j} \le c$ and
$k_{n,j} \ge k$ for all $n, j \ge 1$. Let




1/2 



(2.12) 





n>l 



Finally, if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $\mathcal{S}_{n,n}$, then again (2.12) and (2.13)
hold.

Remark 2.2. Note that (2.12) immediately implies



Sn,n||oo „ ((UE(K)) + n 



n 



1/2 



In particular, if $L(E(N_n^*))/n$ converges to zero, then (2.14) implies that
$\|\mathcal{S}_{n,n}\|_\infty/n^{1/2}$ tends to zero in probability. In addition, (2.13) implies with
probability one that

(2.15) limsup%4^<l 

and hence $\mathcal{S}_{n,n}/n^{1/2}$ converges to zero with probability one provided $b_1 b_2 > 1$
and $\lim_{n\to\infty} n^{-1} L(E(N_n^*)) = 0$. Furthermore, if $N_{n,i} = p_n \ge n$, $\{i \ge 1, n \ge 1\}$,
then (2.14) immediately relates to the results of Corollary 2 in [7], as it
implies

/r,i/;\ ||Sn,n||oo ^ / f L{pn^^ 

In particular, (2.14) improves Corollary 2 and its proof considerably whenever
$r > 2$ there, and the case $0 < r < 2$ will be discussed in what follows.
Once we establish Lemma 1 below these results also apply to Corollary 1 of 
[7] in a standard way. 
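The logarithmic dependence on the dimension in these rates can be seen in a small numerical experiment. The sketch below is illustrative only and makes assumptions not in the paper: the data are i.i.d. standard normal with no missing entries, the row length is a fixed $p$, and the reference value $(2L(p)/n)^{1/2}$ uses the Gaussian-specific constant 2. The function names are hypothetical.

```python
import math
import random

def L(x):
    """L(x) = log(max(x, e)), the truncated logarithm used in the paper."""
    return math.log(max(x, math.e))

def sup_norm_of_mean(n, p, seed=0):
    """Sup-norm of the coordinate-wise sample mean of n i.i.d. N(0, I_p) vectors."""
    rng = random.Random(seed)
    col_sums = [0.0] * p
    for _ in range(n):
        for j in range(p):
            col_sums[j] += rng.gauss(0.0, 1.0)
    return max(abs(s) for s in col_sums) / n

n, p = 20, 200
observed = sup_norm_of_mean(n, p)
reference = math.sqrt(2.0 * L(p) / n)   # assumed Gaussian benchmark, not a bound from the paper
ratio = observed / reference
```

For Gaussian data the ratio is typically somewhat below one, which is consistent with the sup-norm of the sample mean being of order $(L(p_n)/n)^{1/2}$ even when $p$ is much larger than $n$.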

We next study the situation when the random variables $\{\xi_{n,i,j}\}$ satisfy
the exponential tail condition (2.1) with $0 < r < 2$, or polynomial decay as
in (2.4). When $1 \le r < 2$, a special case of these results clarifies Corollary 2
of [7]. This can be seen in Remark 2.3 below. The $r = 2$ case in this corollary
already appeared in (2.16) when $V_{n,j} = n$. It should also be observed that
Theorem 2.3 provides sufficient conditions for consistency which involve a
precise relationship between the size of $p_n$ in the large p small n problem,
and the tail decay of the data. This relationship is shown to exist even when
there is only polynomial decay in the data, and as one might expect in this
situation the growth of $p_n$, or $E(N_n^*)$, needs to be further restricted, that is,
in such results $p_n$ and $E(N_n^*)$ grow at a corresponding polynomial rate.

Theorem 2.3. Let $\{X_{n,i} : 1 \le i \le n\}$ be as in (1.2) and assume that
(2.1) holds with $0 < r < 2$. Also assume for all $n \ge 1$ and $j \ge 1$, that $c_{n,j} \le c$
and $k_{n,j} \ge k$, where $1 \le c < \infty$, $0 < k < \infty$. Let $s_n = c_1(L(E(N_n^*)) + 2L(n))^{1/r}$,
and

(2.17) $h(n) = (c_2^{-1} L(E(N_n^*)) + c_3 L(n))^{1/2}$,

where $c_1 > 2/k^{1/r}$, $c_2 = k/(128c)$, and $c_3 > 0$. Then

(2.18) lim p( >l) =0. 



n— s-oo 
If we also assume $c_2 c_3 > 1$, then

(2.19) Yl ^(l|Sn,n||oo > n^^hnKn)) < 00. 

n>l 

Furthermore, if $k > 2$ and the polynomial condition in (2.4) holds, then

(2.20) $\|S_{n,n}\|_\infty = O_P(n^{1/2} s_n (L(E(N_n^*)))^{1/2})$,

where $s_n = (nE(N_n^*))^{1/k+\beta}$ and $\beta > 0$. Additionally, if $E(N_n^*) \ge n$, $b > 8$,
and $k\beta > 1/2$, then

(2.21) $\sum_{n \ge 1} P(\|S_{n,n}\|_\infty > b s_n n^{1/2} (L(E(N_n^*)))^{1/2}) < \infty$.

In particular, if $E(N_n^*)$ is asymptotic to $n^\gamma$ for $\gamma \ge 1$, then

(2.22) $\sum_{n \ge 1} P(\|S_{n,n}\|_\infty > b s_n n^{1/2} (L(E(N_n^*)))^{1/2}) < \infty$,

provided $b > 8$ and $(\gamma + 1)k\beta > 1$.

Remark 2.3. An immediate consequence of (2.18) is that 

(2.23) l|Sn.n||oc ^ / (L(i?(jV-)) + L(n))(^+-)/(^-) 

and if $(L(E(N_n^*)))^{(2+r)/(2r)}/n^{1/2} \to 0$, then (2.23) easily implies $S_{n,n}/n$
converges to zero in probability. In addition, if $N_{n,i} = p_n$ for $n \ge 1$, $i \ge 1$, where
$\{p_n : n \ge 1\}$ is a sequence of integers, and $p_n \ge n$, then it follows from (2.23)
that



Hence, using the above for $r \in (0,2)$, and (2.16) for the case $r = 2$, one
obtains an extension and clarification of Corollary 2 and its proof in [7]. It is
also interesting to observe that the method of proof for Theorem 2.3 applied 
to the r = 2 situation only yields 

(2.25) llSn.nlloo ^Q^/^£(Pn) 

Hence, we see the methods used for the r = 2 case in Theorem 2.2 are sharper 
than those we have for other values of r. 

Remark 2.4. Under the assumption of polynomial decay given in Theorem 2.3,
and assuming that $E(N_n^*)$ is asymptotic to $n^\gamma$ for $\gamma \ge 1$, we easily
see from (2.22) that $\|S_{n,n}\|_\infty/n$ converges to zero almost surely
provided $k$ is sufficiently large so that for $\beta > 0$ we have $(\gamma + 1)\beta k > 1$ and
$(\gamma + 1)(1/k + \beta) < 1/2$.

2.2. Asymptotic normality results. In this section, we present results on
the asymptotic normality of the quantity $\mathcal{S}_{n,n}$. Since this estimator typically
lives in $c_0$, Theorem 2.4 is a central limit theorem in that setting. Nevertheless,
we also have proved CLTs in $\ell_p$, $2 \le p < \infty$, similar to that found
in Theorem 2.4. They appear in [8]. These results hold when the underlying
process is a triangular array with random row lengths and possibly
missing data. We also are able to use the coordinate-wise random normalizations
$V_{n,j}^{1/2}$. However, we also use classical normalizations for some of the
$\ell_p$ results, and in that case the related CLTs hold under far weaker moment
conditions. See [1], page 206, for some classical results. The paper [14] contains
CLTs in $c_0$, as well as related references, and much is known about the
CLT in the spaces $\ell_p$, $2 \le p < \infty$. However, none of these results incorporate
random row lengths, missing data, or coordinate-wise random normalizations
in their formulations. In addition, the results in [14] require a uniform
boundedness assumption on the $\{\xi_{n,i,j}\}$ to obtain results related to what we
prove. Finally, we mention that in our simulation results we include the use
of our CLTs in $\ell_p$, $2 \le p < \infty$.

A key assumption in any central limit theorem is that there is a limiting 
covariance function. Since our results include the use of random column-wise
normalizers, we have need of a couple of different limiting covariances. That
is, if

$\Gamma_n(j_1, j_2) = \sum_{i=1}^{n} E(\xi_{n,i,j_1}\xi_{n,i,j_2})/n$

is such that

(2.26) $\lim_{n\to\infty} \Gamma_n(j_1, j_2) = \Gamma(j_1, j_2)$

for all $j_1, j_2 \ge 1$, then for $k = 1, 2$ we set

(2.27) $\Gamma(k, j_1, j_2) = p^{k}\,\Gamma(j_1, j_2)$ for $j_1 \ne j_2$,

and

(2.28) $\Gamma(k, j_1, j_2) = p^{k-1}\,\Gamma(j_1, j_2)$ for $j_1 = j_2$.



Theorem 2.4. Let $\{X_{n,i} : 1 \le i \le n\}$ be as in (1.2), assume (2.1) holds
with $r = 2$, and that $c_{n,j}$, $k_{n,j}$ are constants such that $c_{n,j} \ge 1$, $k_{n,j} \le \infty$ and

(2.29) $\sup_{n \ge 1, j \ge 1} c_{n,j}/k_{n,j} < \infty$.

Also assume for all $\delta > 0$ that

(2.30) $\lim_{d\to\infty} \sup_{n \ge 1} \sum_{j \ge d} \exp\{-\delta k_{n,j}/c_{n,j}\} = 0$.

If $\mathcal{S}_{n,n}$ is given as in (1.7), then

(2.31) $\{\mathcal{L}(\mathcal{S}_{n,n}) : n \ge 1\}$ is tight in $c_0$.

In addition, if $T_{n,n}$ is as in (1.9), and for each $j \ge 1$ we have
$\lim_{n\to\infty} P(N_{n,i} < j) = 0$, then $T_{n,n}$ converges in probability to zero in $c_0$.
Moreover, if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $\mathcal{S}_{n,n}$, then again (2.31) holds,
and $T_{n,n}$ converges in probability to zero. Furthermore, if we also assume (2.26), (2.27), and
(2.28) hold, and for each $d < \infty$ we have $P(\min_{1 \le i \le n} N_{n,i} \le d) = o(1/n^2)$ as
$n$ tends to infinity, then $\Gamma(k, \cdot, \cdot)$ is the covariance of a centered Gaussian
measure $\gamma_k$ on $c_0$ for $k = 1, 2$, and

(2.32) $\mathcal{L}(\mathcal{S}_{n,n})$ converges weakly to $\gamma_1$

on $c_0$. If the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$, then (2.32) still holds with limiting
measure $\gamma_2$.

Remark 2.5. The conditions (2.26), (2.29), and (2.30), along with (2.1)
when $r = 2$, allow the limiting Gaussian measures $\gamma_k$ to exist on $c_0$. Moreover,
without such assumptions, with the most important being (2.26) and (2.30),
there are examples of triangular arrays of the form indicated when the CLT
must fail on $c_0$, although it may hold on $R^\infty$. Of course, without (2.26),
the CLT will fail even on $R^\infty$.

3. Application to hypothesis tests. In this section, we deal with the application
of our results to the one-sample problem, the two-sample problem, and one-way
random effects models. Joint hypothesis testing was also considered in [11].
We assume throughout this section that the distribution of the random vectors
$X_{n,i}$ in (1.2) is independent of $i$. This implies that $E(\xi_{n,i,j}) = \mu_{n,i,j}$
is independent of $i$ and we write it as $\mu_{n,j}$.

3.1. One-sample and two-sample problems. In this subsection, we apply 
our results to test if the "mean vector" equals a specified vector in the 
one-sample case and if the difference in the "mean vectors" is zero in the 
two-sample case. More precisely, for the one-sample case consider testing 
the null hypothesis $H_0 : \mu_n = 0$, where $\mu_n$ is an infinite-dimensional vector
whose components are $\mu_{n,j}$. The quantity $\mathcal{S}_{n,n}$, defined by (1.7), can be used
for developing a test of $H_0$. To this end, let us denote the data vectors by
$X_n = \{X_{n,1}, \ldots, X_{n,n}\}$. One can use the $\ell_p$ norm for $p \ge 2$ and the $c_0$ norm
to define various nonrandomized test functions $\phi_p(X_n)$ as follows:

(3.1) $\phi_p(X_n) = \begin{cases} 1, & \text{if } \|\mathcal{S}_{n,n}\|_p \ge c, \\ 0, & \text{otherwise,} \end{cases}$

where $c = c_p$ is so chosen that $E(\phi_p(X_n) \mid H_0) \le \alpha$. The test function based
on the $c_0$ norm is given by

(3.2) $\phi_\infty(X_n) = \begin{cases} 1, & \text{if } \|\mathcal{S}_{n,n}\|_\infty \ge c, \\ 0, & \text{otherwise,} \end{cases}$

where $c = c_\infty$ is so chosen that $E(\phi_\infty(X_n) \mid H_0) \le \alpha$. In the context of the
two-sample problem, the null hypothesis is $H_0 : \mu_1 = \mu_2$, where $\mu_k$ represents
the infinite-dimensional mean vector from the $k$th population. Now,
using $\mathcal{S}_{n,n}^{(k)}$ to denote $\mathcal{S}_{n,n}$ for the $k$th population [note that these are
constructed using a superscript $k$ in all the quantities in (1.2) and (1.3) and
these quantities are independent in $k$], the test function for the $\ell_p$ norm is

(3.3) $\phi_p^{(2)}(X_n) = \begin{cases} 1, & \text{if } \|\mathcal{S}_{n,n}^{(1)} - \mathcal{S}_{n,n}^{(2)}\|_p \ge c, \\ 0, & \text{otherwise,} \end{cases}$

where $c = c_p$ is so chosen that $E(\phi_p^{(2)}(X_n) \mid H_0) \le \alpha$. The test function based
on the $c_0$ norm is given by

(3.4) $\phi_\infty^{(2)}(X_n) = \begin{cases} 1, & \text{if } \|\mathcal{S}_{n,n}^{(1)} - \mathcal{S}_{n,n}^{(2)}\|_\infty \ge c, \\ 0, & \text{otherwise,} \end{cases}$

where $c = c_\infty$ is so chosen that $E(\phi_\infty^{(2)}(X_n) \mid H_0) \le \alpha$. We observe that under
our formulation unequal sample sizes from the two populations are allowed.
In some applications, the number of components in the two groups and the
sample sizes coincide. In these cases, as in traditional multivariate analysis,
the two-sample problem reduces to the one-sample problem.

To perform the test one needs the distribution of $\|\mathcal{S}_{n,n}\|_p$, $\|\mathcal{S}_{n,n}\|_\infty$,
$\|\mathcal{S}_{n,n}^{(1)} - \mathcal{S}_{n,n}^{(2)}\|_p$, and $\|\mathcal{S}_{n,n}^{(1)} - \mathcal{S}_{n,n}^{(2)}\|_\infty$. The following proposition provides the
asymptotic distribution for these statistics. Its proof can be obtained as a
corollary of Theorem 2.4 for the $c_0$ case, and Theorem 7 and Remark 11
of [8] for $\ell_p$ spaces, using the fact that the distribution function of the
norm with respect to a Gaussian measure on a separable Banach space is
continuous.

Proposition 3.1. If the appropriate null hypothesis holds, then Theorem 2.4
implies that $P(\|\mathcal{S}_{n,n}\|_\infty \ge c)$ and $P(\|\mathcal{S}_{n,n}^{(1)} - \mathcal{S}_{n,n}^{(2)}\|_\infty \ge c)$ converge
to $P(\|G\|_\infty \ge c)$ and $P(\|G^{(1)} - G^{(2)}\|_\infty \ge c)$, respectively, for all $c > 0$,
where $\mathcal{L}(G) = \mathcal{L}(G^{(1)}) = \mathcal{L}(G^{(2)}) = \gamma$ and $\gamma$ is the Gaussian measure
identified there. Furthermore, $G^{(1)}$ and $G^{(2)}$ are independent. A similar result
holds under the conditions of Theorem 7 of [8] when the infinity norm is
replaced by the p-norm.
3.2. One-way random effects models. In the analysis of gene-expression
data, the random effects model with random plate effect is often used for
data analysis. That is, if $\xi_{n,i,j,k}$ are the expression levels of the $j$th gene in
the $i$th replicate, receiving the $k$th treatment, then for $k = 1, 2$ the one-way
random effects model is given by

(3.5) $\xi_{n,i,j,k} = \mu_{n,j,k} + \tau_{n,i,k} + \varepsilon_{n,i,j,k}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, b(n)$,

where $\mu_{n,j,k}$ is the mean expression level for the $j$th gene receiving the $k$th
treatment in the lab $n$, $\tau_{n,i,k}$ are independent Gaussian random variables
with mean 0 and variance $\sigma_\tau^2$, and $\varepsilon_{n,i,j,k}$ are i.i.d. random variables with mean
0 and variance $\sigma^2$. More complicated models that take into account tip effect
and dye effects have been studied in the applied literature. The index $n$
in the subscript is usually suppressed in the applied literature but we keep
it to show the relationship with our model. The random variables $\tau_{n,i,k}$
introduce correlations in the expression levels $\xi_{n,i,j,k}$ across genes and
a standard calculation shows that this correlation structure is compound
symmetric. This model can be seen as a particular case of our model with
$N_{n,i} = b(n)$, $R_{n,i,j} = 1$, and compound symmetric covariance matrix.
Proposition 3.1 above can be used for performing hypothesis tests concerning the
expression levels of multiple genes simultaneously. Furthermore, the models 
developed in the paper allow for some extensions of the one-way random 
effects models to incorporate missing data and random number of parame- 
ters. 
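The standard calculation behind the compound symmetric structure can be spelled out as follows. This is a sketch under two assumptions that are made explicit here: the plate effects $\tau_{n,i,k}$ are independent of the errors $\varepsilon_{n,i,j,k}$, and $\sigma_\tau^2$ denotes the variance of $\tau_{n,i,k}$.

```latex
% Fix n, i, k and two distinct genes j \ne j'.
% Assumption: \tau_{n,i,k} is independent of the errors \varepsilon_{n,i,j,k},
% and \sigma_\tau^2 denotes the variance of \tau_{n,i,k}.
\operatorname{Var}(\xi_{n,i,j,k})
  = \operatorname{Var}(\tau_{n,i,k}) + \operatorname{Var}(\varepsilon_{n,i,j,k})
  = \sigma_\tau^2 + \sigma^2 ,
\qquad
\operatorname{Cov}(\xi_{n,i,j,k}, \xi_{n,i,j',k})
  = \operatorname{Var}(\tau_{n,i,k})
  = \sigma_\tau^2 ,
\qquad
\operatorname{Corr}(\xi_{n,i,j,k}, \xi_{n,i,j',k})
  = \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2} .
```

The covariance matrix across genes therefore has a constant diagonal and a constant off-diagonal entry, which is the compound symmetric form.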

4. Simulation results and real data analysis. 

4.1. Simulation results. In this section, we evaluate our methodology, 
using simulations, when the number of replications is small, but the number 
of variables is large. All our simulation results are based on 5000 independent 
trials of 10 replications. We purposely chose n small to reflect many real 
applications. 

As a first step, we need to "approximate" the limiting distribution of the
random variables appearing in Proposition 3.1. We will work with the case
$N_{n,i} = b(n)$, and assume that $X_{n,1}, \ldots, X_{n,n}$ are $n$ i.i.d. $b(n)$-dimensional
vectors with distribution $G_n(\cdot)$, whose coordinates have tails that satisfy
the sub-Gaussian property. Let $\hat\Sigma_n$ denote an estimate of the covariance
matrix $\Sigma_n = ((\sigma_{n,u,v}))$, where $\Sigma_n$ is a $b(n) \times b(n)$ matrix given by

(4.1) $\sigma_{n,u,v} = E[(\xi_{n,1,u} - \mu_{n,u})(\xi_{n,1,v} - \mu_{n,v})]$.

In the above definition, $\mu_{n,u}$ and $\mu_{n,v}$ are the specified values under the
null hypothesis. Note that $\hat\Sigma_n$ is a function of the data vector $X_n$. One
choice for $\hat\Sigma_n$ is the sample covariance matrix. In fact, better options are
available, and we will explain them later below. If $\hat\Sigma_n$ is positive definite,
then given $X_n$, we generate $t$ i.i.d. random vectors $Y_{n,i}$ of dimension $b(n)$
whose distribution is Gaussian with mean vector 0 and covariance matrix
$\hat\Sigma_n$; that is,

(4.2) $Y_{n,i} \mid X_n \sim N_{b(n)}(0, \hat\Sigma_n)$ a.s., $1 \le i \le t$.

We will call $Y_{n,i}$ the Monte Carlo (MC) samples, and throughout the simulations
$t = 2000$. We will use $\|\cdot\|$ to denote the $\ell_p$ norm ($p \ge 2$) or the $c_0$
norm depending on the space being used. Let $\|Y_{n,1}\|, \ldots, \|Y_{n,t}\|$ denote the
norms of the MC samples. Furthermore, consider the following nonparametric
density estimator; namely, for $x \in R$,

(4.3) $h_t(x) = \frac{1}{t c_t} \sum_{i=1}^{t} K\left(\frac{x - \|Y_{n,i}\|}{c_t}\right)$,

where $c_t$ is a sequence of positive constants converging to 0 such that $t c_t \to \infty$,
and $K(\cdot)$ is a density function with $\int_R t K(t)\,dt = 0$. In the above, we have
suppressed the dependence on $n$ and on $\omega$ since $n$ and $\omega$ will be held fixed in
this discussion. It follows from Devroye [2] that as $t \to \infty$, for every fixed
$\omega \in \Omega$, $h_t(x)$ converges almost everywhere with respect to Lebesgue measure
and in $L_1$ to the probability density of the random variable $\|N(0, \hat\Sigma_n)\|$. In
all our numerical experiments, we will take $K(\cdot)$ to be a standard normal
density, $t = 2000$, and fix the window width $c_t$ at 0.7. Figure 1(a) presents the
graph of the density function for the 2-norm, the 10-norm, and the sup-norm.
We use these densities to "approximate" the tail probabilities of the norms of
the limiting Gaussian appearing in Proposition 3.1. Figure 1(b) shows that
the p-values from these hypothesis tests are uniformly distributed when we
use the sup-norm. This histogram was generated using the p-values from a
hypothesis test for data generated using a compound symmetric covariance
structure. The vertical axis represents the proportion of times the p-values
belonged to a particular range.

Fig. 1. (a) Kernel density estimates of the norms of statistics; (b) histogram of p-values
under the sup norm.
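The Monte Carlo approximation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the helper names (`cholesky`, `mc_norms`, `kde_density`, `kde_tail_probability`) are hypothetical, the covariance estimate is assumed positive definite, and the tail probability is obtained by integrating the Gaussian-kernel density estimate in closed form, which is one of several reasonable ways to turn the estimated density into an approximate p-value.

```python
import math
import random

def cholesky(a):
    """Lower-triangular C with C C^T = a, for a symmetric positive definite matrix."""
    d = len(a)
    C = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(s) if i == j else s / C[j][j]
    return C

def mc_norms(sigma_hat, t=2000, p=math.inf, seed=0):
    """Norms of t Monte Carlo draws Y ~ N(0, sigma_hat), as in (4.2)."""
    rng = random.Random(seed)
    C = cholesky(sigma_hat)
    d = len(sigma_hat)
    norms = []
    for _ in range(t):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = [sum(C[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        if p == math.inf:
            norms.append(max(abs(v) for v in y))               # sup-norm
        else:
            norms.append(sum(abs(v) ** p for v in y) ** (1.0 / p))  # l_p norm
    return norms

def kde_density(x, norms, c_t=0.7):
    """Kernel density estimate at x with a standard normal kernel and width c_t."""
    scale = len(norms) * c_t * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - y) / c_t) ** 2) for y in norms) / scale

def kde_tail_probability(s, norms, c_t=0.7):
    """Integral of the kernel estimate over [s, infinity): an approximate p-value."""
    return sum(0.5 * math.erfc((s - y) / (c_t * math.sqrt(2.0))) for y in norms) / len(norms)

# Toy compound symmetric covariance estimate (hypothetical values).
sigma_hat = [[1.0 if u == v else 0.5 for v in range(5)] for u in range(5)]
norms = mc_norms(sigma_hat, t=2000)
p_value = kde_tail_probability(2.5, norms)   # 2.5 stands in for an observed sup-norm
```

The defaults `t=2000` and `c_t=0.7` follow the values reported in the text; the dimension 5 and the observed value 2.5 are placeholders.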

Structured and unstructured covariance estimation, shrinkage and sparsity.
The methodology described above requires an estimate of the covariance
matrix. It is folklore in the statistics literature that covariance matrix estimation
is a hard problem, and the difficulties increase when the number of
variables is much larger than the sample size. The papers [10] and [17], as 
well as others, have clearly demonstrated that the sample covariance matrix 
behaves poorly in terms of the mean square error. This difficulty is frequently 
due to having a large number of parameters to estimate, and a limited num- 
ber of observations available to estimate them. Hence, it is reasonable to 
expect that if one uses structured covariances some of these difficulties can 
be mitigated. However, one has to be cautious since it is also well known 
that assuming independence when correlations are present leads to substantial
bias in type-I error rates. Extensive simulations concerning these issues
are described in detail in [8] and [9]. Thus, to make the methodology pre- 
sented here useful and applicable, we need to describe how to handle the 
covariance matrix estimation. 

In studies involving the joint analysis of multiple cDNA microarray data
[16], one frequently encounters covariances that have a block structure. This
phenomenon also occurs in the context of sparse covariance estimation where
regularization is adopted. Borrowing the idea from shrinkage estimation,
[10] developed an estimator of $\Sigma_n$ by taking a convex combination of
the unstructured sample covariance matrix and a structured covariance matrix.
Their estimator is given by

$$
\hat\Sigma_n^{*} = \lambda \hat\Sigma_n + (1 - \lambda) S_n, \tag{4.4}
$$

where $S_n$ is the method of moments estimator and $\hat\Sigma_n$ is an
estimator assuming a particular structure for the covariance matrix. The
parameter $\lambda$ can be estimated from the data, and has a closed form
expression when $\hat\Sigma_n$ is taken to be the identity matrix, or is
compound symmetric, heterogeneous compound symmetric, as well as many other
structures. We propose to use (4.4) as the estimator of the covariance matrix
needed for generating samples in (4.2). Our extensive simulations described
in [8] show that when one shrinks the variances and the covariances, the
type-I error rate approaches the nominal values for the test based on the
sup-norm. These methods can also be applied to situations when there is mean
sparsity. In these cases, one can apply a LASSO type algorithm for estimation
purposes and then use our methodology for hypothesis testing.
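As a rough illustration of a convex combination of a sample covariance
matrix with a structured target, the following Python sketch shrinks the
off-diagonal entries toward zero. The diagonal target and the hand-picked
weight `lam` are assumptions made only for this example; the data-driven
choice of the weight discussed above is not implemented, and the function
names are hypothetical.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of observations given as equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def shrink_to_diagonal(s, lam):
    """Return lam * T + (1 - lam) * S, where T = diag(S) is the assumed target."""
    p = len(s)
    # Diagonal entries are unchanged; off-diagonal entries are scaled by (1 - lam).
    return [[s[a][b] if a == b else (1.0 - lam) * s[a][b] for b in range(p)]
            for a in range(p)]


# Toy data: 4 observations of a 3-dimensional vector (illustrative only).
data = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 2.5], [4.0, 3.0, 3.5]]
S = sample_covariance(data)
S_star = shrink_to_diagonal(S, lam=0.4)
```

With `lam = 0` the sample covariance is returned unchanged, and with
`lam = 1` all correlations are discarded, so the weight controls how
strongly the structure is imposed.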
Table 1
Type I error rates with unstructured covariance matrix and shrinkage

                 p = 1      p = 4      p = ∞
b(n) = 100       0.1312     0.0838     0.0686
b(n) = 500       0.1914     0.0874     0.057
b(n) = 1000      0.237      0.1042     0.0526


Simulation analysis using real data information. We now describe a
numerical experiment which examines whether some of the difficulties
described in the previous subsection due to covariance matrix estimation can
be reduced by using the shrinkage method for estimating the covariance
matrix. To make the model closely resemble microarray data, we considered
the leukemia data set described in [6], which studies the gene expression
in two types of leukemia, acute lymphoblastic leukemia (ALL) and acute
myeloid leukemia (AML). We use the same preprocessing step as described
in [3], Section 3.1, retaining 3571 genes from 72 patients, 47 ALL and 25
AML. We apply the standardization technique described in Section 3.3 of [3]
on these retained genes. For our simulations, we consider information from
the AML group only. The nominal type-I error rate is taken to be 5% in all
of our experiments.

The simulation experiments are based on data generated from a
$b(n)$-dimensional normal random variable with n = 10. We now describe how
the mean and covariance matrix are obtained from the data. First, we fix
$b(n)$ and randomly select $b(n)$ genes from the 3571 genes. From the
25 AML patients, we estimate the $b(n)$-dimensional mean vector $\hat\mu_n$
and the corresponding covariance matrix $\hat\Sigma_n$ by shrinking the
covariances as described in [17]. The shrinkage method yields a nonsingular
covariance matrix. For all the 5000 simulations, this mean vector and the
covariance matrix are fixed. Now, using the shrinkage method described above,
we apply our methodology. The resulting type-I error rates are reported in
Table 1 above.
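The logic of such a type-I error experiment can be sketched in a few lines
of Python. This is only a simplified stand-in, not the procedure used for
Table 1: it assumes an identity covariance matrix that is known rather than
estimated by shrinkage, a null mean of zero, and a Monte Carlo critical value
for the sup-norm of a standard Gaussian vector. All function names and
parameter values are hypothetical.

```python
import math
import random


def sup_norm_stat(sample, mu0):
    """sqrt(n) times the sup-norm of (sample mean - mu0)."""
    n, dim = len(sample), len(mu0)
    means = [sum(row[j] for row in sample) / n for j in range(dim)]
    return math.sqrt(n) * max(abs(m - m0) for m, m0 in zip(means, mu0))


def critical_value(dim, level, draws, rng):
    """Monte Carlo (1 - level) quantile of the sup-norm of a N(0, I) vector."""
    sims = sorted(max(abs(rng.gauss(0.0, 1.0)) for _ in range(dim))
                  for _ in range(draws))
    return sims[int(math.ceil((1.0 - level) * draws)) - 1]


def type_one_error(n, dim, level, reps, draws, seed):
    """Fraction of simulated null data sets on which the test rejects."""
    rng = random.Random(seed)
    mu0 = [0.0] * dim
    crit = critical_value(dim, level, draws, rng)
    rejections = 0
    for _ in range(reps):
        # Data generated under the null hypothesis: mean mu0, identity covariance.
        sample = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
        if sup_norm_stat(sample, mu0) > crit:
            rejections += 1
    return rejections / reps


rate = type_one_error(n=10, dim=20, level=0.05, reps=2000, draws=5000, seed=1)
```

In this idealized setting the rejection frequency should be close to the
nominal 5% level; departures of the kind reported in Table 1 arise once the
covariance matrix has to be estimated.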

From the table, we notice that the type-I error rate is closer to the nom- 
inal value for larger values of p. Further analysis (data not presented here) 
shows that this phenomenon existed for the other choices of variances and 
covariances as well (see [8]). 

4.2. Real data analysis. In a typical microarray situation, one is
interested in identifying whether a set of genes is differentially expressed.
This is a two-sample problem, and one can use the methods described in
Proposition 3.1 to address it. We analyzed the leukemia data set described
in [6], which also was used in the previous section. We analyzed the sixteen
genes given in Table 2 of [19]. The values of the test statistics defined in
(3.3) were determined to be 4.4992 and 2.7396 for p = 2 and p = 4,
respectively. Also, the value of the test statistic in (3.4) was determined
to be 2.0250. The covariance matrices needed to apply the methodology were
calculated by shrinking the variances and covariances using the algorithm
described in [17]. The p-values corresponding to these test statistics were
all less than 10~^ showing that the genes are differentially expressed. The
same conclusion was also obtained by Yan et al. [19] using different methods.
More importantly, as explained in [19], these genes have biological
significance and the three existing statistical methods in popular use did
not identify them to be differentially expressed. Even though this example
involved only sixteen genes, the simulation results in the previous section
show that such a limitation is not necessary. Thus, this analysis, combined
with our simulation results described above, shows the importance and
usefulness of the proposed methodology.



5. Some probability estimates. Here, we provide some basic probabil- 
ity estimates used throughout the paper. The first lemma deals with the 
sub-Gaussian situation, and the inequality we present for bounded random 
variables is not best possible, as slightly better constants in the basic es- 
timate can be obtained from Theorem 1 in [5]. Nevertheless, we include 
a proof for this case, as our argument generalizes to the unbounded case. 
Our approach is to compute the necessary Laplace transforms, and then 
use Markov's inequality efficiently. This is standard for such problems, but 
in order to proceed from first principles, and also keep track of relevant 
constants, we include the details. 



Lemma 5.1. Let $X_1, \ldots, X_n$ be independent random variables with
$E(X_i) = 0$. If $P(X_i \in [a,b]) = 1$ for $1 \le i \le n$, then

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i/n\Bigr| \ge x\Bigr) \le 2\exp\{-nx^2(2(b-a)^2)^{-1}\} \tag{5.1}
$$

for all $x \ge 0$. In particular, when $n = 1$ each $X_i$ is sub-Gaussian
with relevant constants $c = 2$ and $k = (2(b-a)^2)^{-1}$. If

$$
P(|X_i| \ge x) \le c e^{-kx^2} \tag{5.2}
$$

for $1 \le i \le n$ and all $x \ge 0$, then

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i/n\Bigr| \ge x\Bigr) \le 2\exp\{-nkx^2/(16c)\}. \tag{5.3}
$$

Proof. First, observe that if $Y$ is a mean zero random variable, then
Jensen's inequality implies $E(e^{tY}) \ge e^{tE(Y)} = 1$ for all real $t$.
Thus, for $Y_1, Y_2$ independent copies of $Y$, we have
$E((Y_1 - Y_2)^{l}) = 0$ for $l$ odd, and therefore

$$
E(e^{tY}) \le E(e^{t(Y_1 - Y_2)}) = 1 + \sum_{l \ge 1} t^{2l} E((Y_1 - Y_2)^{2l})/(2l)!. \tag{5.4}
$$

If $P(X_i \in [a,b]) = 1$ for $1 \le i \le n$, then
$E((Y_1 - Y_2)^{2l}) \le (b-a)^{2l}$ and since $(2l)! \ge 2^{l}(l!)^2$ for
$l \ge 1$ we therefore have

$$
E(e^{tY}) \le 1 + \sum_{l \ge 1} t^{2l}(b-a)^{2l}/(2^{l}\,l!) = e^{t^2(b-a)^2/2}.
$$

Applying this estimate to each of the $X_i$'s for $1 \le i \le n$, the
independence of the $X_i$'s and Markov's inequality imply that for each
$t \ge 0$ we have

$$
P\Bigl(\sum_{i=1}^{n} X_i/n \ge x\Bigr) \le e^{-ntx}\prod_{i=1}^{n} E(e^{tX_i}) \le e^{-n(tx - t^2(b-a)^2/2)}.
$$

Since $x \ge 0$, minimizing the right-hand side term over $t \ge 0$, we take
$t = x/(b-a)^2$, and hence

$$
P\Bigl(\sum_{i=1}^{n} X_i/n \ge x\Bigr) \le e^{-nx^2/(2(b-a)^2)}.
$$

Applying the previous argument to $-\sum_{i=1}^{n} X_i$, we thus have (5.1).
To prove (5.3), we first show that if $E(Y) = 0$ and

$$
P(|Y| \ge x) \le c e^{-kx^2} \tag{5.5}
$$

holds for all $x \ge 0$, then

$$
E(e^{tY}) \le e^{4ct^2/k} \tag{5.6}
$$

for all $t \ge 0$. This can be done by utilizing (5.4) and by showing for
$Y_1, Y_2$ independent copies of $Y$ that

$$
E((Y_1 - Y_2)^{2l}) \le 2c(4/k)^{l}\, l\int_0^{\infty} e^{-s}s^{l-1}\,ds = 2c(4/k)^{l}\,l! \le (8c/k)^{l}\,l!.
$$

Thus, (5.6) holds and by applying the previous inequality, independence,
and Markov's inequality as before, we have

$$
P\Bigl(\sum_{i=1}^{n} X_i \ge nx\Bigr) \le \exp\{-nkx^2/(16c)\}.
$$

Applying the previous argument to $-\sum_{i=1}^{n} X_i$, we thus have (5.3),
and the lemma is proven. □
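A quick numerical sanity check of (5.1) can be carried out as follows. The
Python sketch below assumes Rademacher signs (values $\pm 1$, so $a = -1$,
$b = 1$) purely as a convenient bounded, mean zero example; the sample size,
threshold and function names are illustrative choices and not part of the
lemma.

```python
import math
import random


def hoeffding_bound(n, x, a, b):
    """Right-hand side of (5.1): 2 exp(-n x^2 / (2 (b - a)^2))."""
    return 2.0 * math.exp(-n * x * x / (2.0 * (b - a) ** 2))


def empirical_tail(n, x, reps, seed):
    """Monte Carlo estimate of P(|mean of n Rademacher signs| >= x)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        total = sum(rng.choice((-1.0, 1.0)) for _ in range(n))
        if abs(total / n) >= x:
            hits += 1
    return hits / reps


n, x = 100, 0.3
bound = hoeffding_bound(n, x, a=-1.0, b=1.0)
freq = empirical_tail(n, x, reps=5000, seed=0)
```

The simulated frequency is far below the bound, which is consistent with the
remark above that the constants in (5.1) are not best possible.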
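Lemma 5.2. Let $\{X_{n,i} : 1 \le i \le n\}$ be defined as in (1.2), and
assume (2.1) holds for $r = 2$, and the constants $c_{n,j}, k_{n,j}$ are such
that $1 \le c_{n,j}$ and $0 < k_{n,j} < \infty$. If
$Q_d(x) = \sum_{j \ge d+1} x_j e_j$ for $x \in c_0$, and $S_{n,n}$ is given
as in (1.7), then for all $d \ge 0$ and $\delta > 0$

$$
P(\|Q_d(S_{n,n})\|_\infty \ge \delta) \le \sum_{j \ge d+1} 2\exp\{-\delta^2 k_{n,j}/(16c_{n,j})\}. \tag{5.7}
$$

In addition, if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $S_{n,n}$,
then again (5.7) holds.

Proof. We first establish (5.7) for general $V_{n,j}$. When the
$V_{n,j}^{1/2}$ are replaced by $n^{1/2}$, the result then follows by an
immediate simplification of this argument.

If $\theta_{n,i,j} = I(j \le N_{n,i})R_{n,i,j}$ as indicated, then
$P(\theta_{n,i,j} = 1) = p_{n,j}\,p$, where $p_{n,j} = P(j \le N_{n,i})$ for
$n \ge 1$, $j \ge 1$, and for $k = 0, 1, \ldots, n$ we define the events

$$
E_{k,n,j} = \bigcup_{I \in \mathcal{I}_{k,n,j}} F_I, \tag{5.8}
$$

where $\mathcal{I}_{k,n,j}$ denotes all subsets $I = \{i_1, \ldots, i_k\}$ of
size $k$ in $\{1, \ldots, n\}$ and

$$
F_I = \{\theta_{n,i,j} = 1 \text{ for all } i \in I \text{ and } \theta_{n,i,j} = 0 \text{ for } i \in \{1, \ldots, n\} \cap I^{c}\}.
$$

Note that $F_I$ depends on $n$ and $j$, but we suppress that in our notation.
Since $V_{n,j} = \max\{1, \sum_{i=1}^{n}\theta_{n,i,j}\}$ and
$\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j} = 0$ on $E_{0,n,j}$, we therefore
have for each $\delta > 0$, $n \ge 1$, and $d \ge 0$ that

$$
P(\|Q_d(S_{n,n})\|_\infty \ge \delta) \le \sum_{j \ge d+1}\sum_{k=1}^{n} P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge \delta V_{n,j}^{1/2}\Bigr\} \cap E_{k,n,j}\Bigr).
$$

Now

$$
P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge \delta V_{n,j}^{1/2}\Bigr\} \cap E_{k,n,j}\Bigr) = \sum_{I \in \mathcal{I}_{k,n,j}} P\Bigl(\Bigl\{\Bigl|\sum_{l=1}^{k}\xi_{n,i_l,j}\Bigr| \ge \delta k^{1/2}\Bigr\} \cap F_I\Bigr)
$$

and letting

$$
A_n = \sum_{j \ge d+1}\sum_{k=1}^{n} P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge \delta V_{n,j}^{1/2}\Bigr\} \cap E_{k,n,j}\Bigr),
$$

we have by using the independence of the various sequences of random
variables involved that

$$
A_n = \sum_{j \ge d+1}\sum_{k=1}^{n}\sum_{I \in \mathcal{I}_{k,n,j}} P\Bigl(\Bigl|\sum_{l=1}^{k}\xi_{n,i_l,j}\Bigr| \ge \delta k^{1/2}\Bigr)P(F_I, I = \{i_1, \ldots, i_k\})
\le 2\sum_{j \ge d+1}\sum_{k=1}^{n}\exp\{-\delta^2 k_{n,j}/(16c_{n,j})\}P(E_{k,n,j}).
$$

Of course, in the previous inequality we are applying (5.3) of Lemma 5.1 to
estimate $P(\{|\sum_{l=1}^{k}\xi_{n,i_l,j}| \ge \delta k^{1/2}\})$. Thus, we
have (5.7) for general $V_{n,j}$.

When the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$, the proof is immediate
since the random variables
$\{\xi_{n,i,j}\theta_{n,i,j} : n \ge 1, i \ge 1, j \ge 1\}$ also satisfy
(2.1), and hence one can apply (5.3) immediately to obtain (5.7). Hence, the
lemma is proven. □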

In order that the probability estimate in the previous lemma be useful,
$k_{n,j}/c_{n,j}$ must be unbounded as $j$ tends to infinity. Our next task
is to see what happens if we remove this assumption, and only ask that this
ratio is uniformly bounded below by a strictly positive constant. This is the
content of our next lemma, which is a modification of Lemma 5.2.

Lemma 5.3. Let $\{X_{n,i} : 1 \le i \le n\}$ be defined as in (1.2), and
assume (2.1) holds for $r = 2$, and the constants $c_{n,j}, k_{n,j}$ are such
that $1 \le c_{n,j} \le c < \infty$ and $0 < k \le k_{n,j} < \infty$. If
$S_{n,n}$ is given as in (1.7), and $N_n^{*} = \max_{1 \le i \le n} N_{n,i}$,
then

$$
P(\|S_{n,n}\|_\infty \ge x) \le 2E(N_n^{*})\exp\Bigl\{-\frac{x^2 k}{16c}\Bigr\}. \tag{5.9}
$$

In addition, if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $S_{n,n}$,
then again (5.9) holds.

Remark 5.1. If $N_{n,i} = p_n$ for $\{i \ge 1, n \ge 1\}$, then (5.9)
immediately implies

$$
P(\|S_{n,n}\|_\infty \ge x) \le 2p_n\exp\Bigl\{-\frac{x^2 k}{16c}\Bigr\}, \tag{5.10}
$$

and if the $V_{n,j}^{1/2}$ are replaced by $n^{1/2}$ in $S_{n,n}$, then again
(5.10) holds.
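To get a feeling for how the bound (5.10) scales with the dimension, one can
solve it for the threshold $x$ at which the right-hand side equals a given
level. The Python sketch below does this; the level `alpha` and the numerical
values of $k$, $c$ and $p_n$ are arbitrary illustrative choices, and the
function names are hypothetical.

```python
import math


def sup_norm_tail_bound(x, p_n, k, c):
    """Right-hand side of (5.10): 2 p_n exp(-x^2 k / (16 c))."""
    return 2.0 * p_n * math.exp(-x * x * k / (16.0 * c))


def threshold_for_level(alpha, p_n, k, c):
    """Value of x at which the bound (5.10) equals alpha (requires alpha < 2 p_n)."""
    return math.sqrt(16.0 * c / k * math.log(2.0 * p_n / alpha))


x_small = threshold_for_level(alpha=0.05, p_n=1000, k=0.5, c=2.0)
x_large = threshold_for_level(alpha=0.05, p_n=10 ** 6, k=0.5, c=2.0)
```

The threshold grows only like the square root of $\log p_n$, which is why a
sup-norm bound of this type remains informative when the dimension is much
larger than the sample size.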

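Proof of Lemma 5.3. Following the proof of Lemma 5.2, we observe that if
$\theta_{n,i,j} = I(j \le N_{n,i})R_{n,i,j}$, then
$P(\theta_{n,i,j} = 1) = p_{n,j}\,p$, where $p_{n,j} = P(j \le N_{n,i})$ for
$n \ge 1$, $j \ge 1$, and for $m = 0, 1, \ldots, n$ we define the events

$$
E_{m,n,j} = \bigcup_{I \in \mathcal{I}_{m,n,j}} F_I,
$$

where $\mathcal{I}_{m,n,j}$ denotes all subsets $I = \{i_1, \ldots, i_m\}$ of
size $m$ in $\{1, \ldots, n\}$ and

$$
F_I = \{\theta_{n,i,j} = 1 \text{ for all } i \in I \text{ and } \theta_{n,i,j} = 0 \text{ for } i \in \{1, \ldots, n\} \cap I^{c}\}.
$$

Recall $V_{n,j} = \max\{1, \sum_{i=1}^{n}\theta_{n,i,j}\}$ and observe that
$\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j} = 0$ on $E_{0,n,j}$. Hence, for each
$x > 0$, $n \ge 1$,

$$
P(\|S_{n,n}\|_\infty \ge x) \le \sum_{u \ge 1}\sum_{j=1}^{u}\sum_{m=1}^{n} P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge xV_{n,j}^{1/2}\Bigr\} \cap E_{m,n,j} \cap \{N_n^{*} = u\}\Bigr).
$$

Now setting $B_{n,m} = \{|\sum_{l=1}^{m}\xi_{n,i_l,j}| \ge xm^{1/2}\}$,

$$
P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge xV_{n,j}^{1/2}\Bigr\} \cap E_{m,n,j} \cap \{N_n^{*} = u\}\Bigr) = \sum_{I \in \mathcal{I}_{m,n,j}} P(B_{n,m} \cap F_I \cap \{N_n^{*} = u\})
$$

and letting

$$
A_n = \sum_{u \ge 1}\sum_{j=1}^{u}\sum_{m=1}^{n} P\Bigl(\Bigl\{\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge xV_{n,j}^{1/2}\Bigr\} \cap E_{m,n,j} \cap \{N_n^{*} = u\}\Bigr),
$$

we have by using the independence of the various sequences of random
variables involved and (5.3) of Lemma 5.1 that

$$
A_n \le 2\sum_{u \ge 1}\sum_{j=1}^{u}\sum_{m=1}^{n}\exp\{-x^2 k_{n,j}/(16c_{n,j})\}P(E_{m,n,j} \cap \{N_n^{*} = u\}).
$$

Thus, by first summing on $m$ and using $c_{n,j} \le c$ and $k_{n,j} \ge k$
for all $n, j \ge 1$, we have that

$$
P(\|S_{n,n}\|_\infty \ge x) \le 2\sum_{u \ge 1}\sum_{j=1}^{u}\exp\{-x^2 k/(16c)\}P(N_n^{*} = u). \tag{5.11}
$$

Hence, this implies (5.9), and when the $V_{n,j}^{1/2}$ are replaced by
$n^{1/2}$, the proof follows from the ideas used in the general case. Thus,
the lemma is proven. □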



Next, we turn to a method which will allow us to handle a broader
collection of random variables. Here the $\{\xi_{n,i,j}\}$ satisfy (2.1) with
$r \in (0,2)$, or the less restrictive conditions of polynomial decay given
in (2.4). Of course, the results depend on the rate of decay of the tails of
the $\{\xi_{n,i,j}\}$, but under a variety of assumptions we are able to
obtain further consistency results in this setting. The relevant probability
inequalities are obtained in our next lemma, and can be viewed as a
substitute for those in Lemma 5.1.

Lemma 5.4. For each integer $n \ge 1$ let $X_1, \ldots, X_n$ be
independent, mean zero random variables, such that for some $r \in (0,2)$ we
have

$$
P(|X_i| \ge x) \le c e^{-kx^{r}} \tag{5.12}
$$

for $1 \le i \le n$ and all $x \ge 0$. Then for all
$x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$ and all $s > 0$

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr) \le 4\exp\Bigl\{-\frac{nx^2}{32s^2}\Bigr\} + 4cn\exp\Bigl\{-\frac{ks^{r}}{2^{r}}\Bigr\}, \tag{5.13}
$$

where $M_{c,k,r} = \int_0^{\infty} c e^{-kx^{r/2}}\,dx < \infty$. In
addition, for all $x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$ and all $s \ge 1$, we
also have

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr) \le 4\exp\Bigl\{-\frac{nkx^2}{128cs^2}\Bigr\} + 4cn\exp\Bigl\{-\frac{ks^{r}}{2^{r}}\Bigr\}. \tag{5.14}
$$

Moreover, if for some $c > 0$ and $k > 2$, (5.12) is replaced by

$$
P(|X_i| \ge x) \le \frac{c}{(1+x)^{k}} \tag{5.15}
$$

for $1 \le i \le n$ and all $x \ge 0$, then for all
$x \ge \sqrt{8M_{c,k}}/\sqrt{n}$ and all $s > 0$

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr) \le 4\exp\Bigl\{-\frac{nx^2}{32s^2}\Bigr\} + \frac{2^{k+2}cn}{(2+s)^{k}}, \tag{5.16}
$$

where $M_{c,k} = \int_0^{\infty} \frac{c}{(1+t^{1/2})^{k}}\,dt < \infty$.

Remark 5.2. If we take $s = n^{1/(2+r)}x^{2/(2+r)}$ with
$x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$, we have that (5.13) implies

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr) \le 4\exp\Bigl\{-\frac{n^{r/(2+r)}x^{2r/(2+r)}}{32}\Bigr\} + 4cn\exp\Bigl\{-\frac{kn^{r/(2+r)}x^{2r/(2+r)}}{2^{r}}\Bigr\},
$$

which makes the exponents on the right of comparable size. Since the proof
of (5.13) and (5.14) also implies (5.13) and (5.14) when $r = 2$, it is
interesting to note that the previous inequality is not as sharp as that in
(5.3) in Lemma 5.1 when $r = 2$.
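The balancing effect of this choice of $s$ is easy to confirm numerically.
The following Python sketch, with arbitrary illustrative values of $n$, $x$
and $r$ and hypothetical function names, checks that $nx^2/s^2$ and $s^{r}$
coincide with $n^{r/(2+r)}x^{2r/(2+r)}$ for the truncation level of
Remark 5.2.

```python
def balanced_truncation(n, x, r):
    """Truncation level s = n^(1/(2+r)) * x^(2/(2+r)) from Remark 5.2."""
    return n ** (1.0 / (2.0 + r)) * x ** (2.0 / (2.0 + r))


def exponents(n, x, s, r):
    """Return (n x^2 / s^2, s^r), the two quantities driving the terms in (5.13)."""
    return n * x * x / (s * s), s ** r


n, x, r = 10_000, 0.5, 1.0
s = balanced_truncation(n, x, r)
gauss_part, tail_part = exponents(n, x, s, r)
common = n ** (r / (2.0 + r)) * x ** (2.0 * r / (2.0 + r))
```

Any other choice of $s$ makes one of the two quantities larger at the expense
of the other, so this value is the natural compromise between the truncated
sum and the truncation error.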



Remark 5.3. If the median of each $X_i$ is zero, then (5.13) and (5.14)
hold for all $x \ge 0$ and $s > 0$. That is, when the medians are zero the
key inequality (5.18) below follows directly from (5.8) in [4], page 147,
without restrictions on $x$. A similar remark holds for (5.14) provided
$s \ge 1$.

Proof of Lemma 5.4. First, we observe that for $r$ fixed, uniformly in
$i$, $1 \le i \le n$, (5.12) implies $E(X_i^2) \le M_{c,k,r} < \infty$.
Hence if $x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$ we have by Chebyshev's inequality
that $P(|\sum_{i=1}^{n} X_i| \ge nx/2) \le 1/2$. Now let $Y_1, \ldots, Y_n$
be an independent copy of $X_1, \ldots, X_n$ and observe that

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr)P\Bigl(\Bigl|\sum_{i=1}^{n} Y_i\Bigr| < nx/2\Bigr) \le P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)\Bigr| \ge nx/2\Bigr). \tag{5.17}
$$

Then for all $x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$ we have

$$
P\Bigl(\Bigl|\sum_{i=1}^{n} X_i\Bigr| \ge nx\Bigr) \le 2P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)\Bigr| \ge nx/2\Bigr). \tag{5.18}
$$

Taking $s > 0$, we define
$(X_i - Y_i)^{s} = (X_i - Y_i)I(|X_i - Y_i| \le s)$. Then

$$
P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)\Bigr| \ge nx/2\Bigr) \le I_n(s,x) + II_n(s,x), \tag{5.19}
$$

where

$$
I_n(s,x) = P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)^{s}\Bigr| \ge nx/2\Bigr)
$$

and

$$
II_n(s,x) = P\Bigl(\max_{1 \le i \le n}|(X_i - Y_i) - (X_i - Y_i)^{s}| > 0\Bigr).
$$

Applying (5.1) to $(X_1 - Y_1)^{s}, \ldots, (X_n - Y_n)^{s}$ we see that

$$
I_n(s,x) \le 2\exp\{-n(x/2)^2(2(2s)^2)^{-1}\} \tag{5.20}
$$

and (5.12) implies

$$
II_n(s,x) \le 2\sum_{i=1}^{n} P(|X_i| \ge s/2) \le 2cn\exp\Bigl\{-\frac{ks^{r}}{2^{r}}\Bigr\}. \tag{5.21}
$$

Applying (5.18), (5.19), (5.20) and (5.21) we thus have (5.13).

The proof of (5.14) follows that for (5.13) up to the point we apply (5.1)
of Lemma 5.1 to $I_n(s,x)$ in (5.20). At this point, we now apply (5.3) of
Lemma 5.1 to the random variables
$(X_1 - Y_1)^{s}, \ldots, (X_n - Y_n)^{s}$. That is, (5.12) implies that for
all $x \ge 0$ and $1 \le i \le n$ that

$$
P(|(X_i - Y_i)^{s}| \ge x) \le P(|X_i| \ge x/2) + P(|Y_i| \ge x/2) \le 2ce^{-kx^2/(4s^2)},
$$

where the last inequality follows since $x^{r}/2^{r} \ge x^2/(4s^2)$ when
$0 \le x \le 2s$, $0 < r \le 2$, and $s \ge 1$. Hence, by (5.3) of Lemma 5.1,
with $k$ replaced by $k/(4s^2)$ and $c$ by $2c$, we obtain

$$
I_n(s,x) \le 2\exp\{-nkx^2/(128cs^2)\}. \tag{5.22}
$$

Now combining (5.22) and the estimate for $II_n(s,x)$ in (5.21) with (5.18)
and (5.19), we obtain (5.14).

Next, we observe that uniformly in $i$, $1 \le i \le n$, (5.15) and $k > 2$
imply

$$
E(X_i^2) = \int_0^{\infty} P(|X_i| \ge t^{1/2})\,dt \le \int_0^{\infty}\frac{c}{(1+t^{1/2})^{k}}\,dt < \infty.
$$

Hence, if we choose $x \ge \sqrt{8M_{c,k}}/\sqrt{n}$ we have by the argument
leading to (5.17)-(5.19) that

$$
P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)\Bigr| \ge nx/2\Bigr) \le I_n(s,x) + II_n(s,x), \tag{5.23}
$$

where

$$
I_n(s,x) = P\Bigl(\Bigl|\sum_{i=1}^{n}(X_i - Y_i)^{s}\Bigr| \ge nx/2\Bigr)
$$

and

$$
II_n(s,x) = P\Bigl(\max_{1 \le i \le n}|(X_i - Y_i) - (X_i - Y_i)^{s}| > 0\Bigr).
$$

Applying (5.1) to $(X_1 - Y_1)^{s}, \ldots, (X_n - Y_n)^{s}$ we see that

$$
I_n(s,x) \le 2\exp\{-n(x/2)^2(2(2s)^2)^{-1}\} \tag{5.24}
$$

and (5.15) implies

$$
II_n(s,x) \le \sum_{i=1}^{n} P(|X_i - Y_i| > s) \le 2\sum_{i=1}^{n} P(|X_i| \ge s/2) \le \frac{2cn}{(1+s/2)^{k}}. \tag{5.25}
$$

Applying (5.23), (5.24) and (5.25), we have (5.16). Thus, Lemma 5.4 is
proven. □

6. Proofs of consistency results. 

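6.1. Proof of Theorem 2.1. Applying Lemma 5.2 with $d = 0$, we have for
all $x \ge 0$ and each integer $n \ge 1$ that

$$
P(\|S_{n,n}\|_\infty \ge x) \le \sum_{j \ge 1} 2\exp\{-x^2 k_{n,j}/(16c_{n,j})\}. \tag{6.1}
$$

Taking $x = \varepsilon a_n$ in (6.1) and applying (2.5), we thus have (2.6)
for general $V_{n,j}^{1/2}$, and also when the $V_{n,j}^{1/2}$ are replaced by
$n^{1/2}$.

If the constants $c_{n,j}$ and $k_{n,j}$ satisfy (2.7) as indicated, then
with $a_n = (L(n+3))^{1/2}$ and $x = \varepsilon a_n$ in (6.1) we have

$$
P(\|S_{n,n}\|_\infty \ge \varepsilon(L(n+3))^{1/2}) \le \sum_{j \ge 1} 2\exp\{-\varepsilon^2\delta L(n+3)L(j+3)\} = 2\sum_{j \ge 1}(j+3)^{-\varepsilon^2\delta L(n+3)}. \tag{6.2}
$$

Thus, for $\varepsilon^2\delta > 1$,

$$
\sum_{n \ge 1} P(\|S_{n,n}\|_\infty \ge \varepsilon(L(n+3))^{1/2}) \le 2\sum_{n \ge 1}\int_3^{\infty} x^{-\varepsilon^2\delta L(n+3)}\,dx = 2\sum_{n \ge 1}\frac{3^{-(\varepsilon^2\delta L(n+3)-1)}}{\varepsilon^2\delta L(n+3)-1} < \infty, \tag{6.3}
$$

and hence (2.8) holds for general $V_{n,j}$. In particular, we then have from
(6.3) that (2.9) is immediate, and it remains to show
$E(e^{\alpha Z^2}) < \infty$ for all $\alpha > 0$ sufficiently small. Now

$$
E(e^{\alpha Z^2}) \le 3 + \int_3^{\infty} P(e^{\alpha Z^2} \ge t)\,dt
$$

and

$$
\int_3^{\infty} P(e^{\alpha Z^2} \ge t)\,dt \le \sum_{n \ge 1}\int_3^{\infty} P\Bigl(\|S_{n,n}\|_\infty \ge (L(n+3))^{1/2}\Bigl(\frac{\log t}{\alpha}\Bigr)^{1/2}\Bigr)\,dt
\le 2\sum_{n \ge 1}\sum_{j \ge 1}\int_3^{\infty}\exp\Bigl\{-\delta L(j+3)L(n+3)\frac{\log t}{\alpha}\Bigr\}\,dt,
$$

where the last inequality follows from (6.1) and that (2.7) holds.
Therefore, for $\alpha \le \delta/2$ we have

$$
E(e^{\alpha Z^2}) \le 3 + 2\sum_{n \ge 1}\sum_{j \ge 1}\int_3^{\infty}\exp\{-2L(j+3)L(n+3)\log t\}\,dt
= 3 + 6\sum_{n \ge 1}\sum_{j \ge 1}\frac{\exp\{-2\log 3\,L(j+3)L(n+3)\}}{2L(j+3)L(n+3)-1}.
$$

Now $x, y \ge 1 + \eta$ for some $\eta > 0$ implies
$xy \ge (x+y)(1+\eta)/2$ and hence since $j, n \ge 1$ implies
$L(j+3), L(n+3) \ge L(4) \ge 1 + \eta$ for $\eta = L(4) - 1 > 0$ we have

$$
E(e^{\alpha Z^2}) \le 3 + 6\sum_{n \ge 1}\sum_{j \ge 1}\frac{\exp\{-(1+\eta)\log 3\,(L(j+3)+L(n+3))\}}{2L(j+3)L(n+3)-1} < \infty
$$

since $(1+\eta)\log 3 > 1$. Since (5.1) holds when the $V_{n,j}^{1/2}$ are
replaced by $n^{1/2}$, the proof also holds in this situation. Thus,
Theorem 2.1 is proven.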

6.2. Proof of Theorem 2.2. Under our assumptions, Lemma 5.3 with
$x = h(n)$ implies that

$$
P(\|S_{n,n}\|_\infty \ge h(n)) \le 2E(N_n^{*})\exp\Bigl\{-\frac{kh(n)^2}{16c}\Bigr\}.
$$

Since $x = h(n) = (\theta_1^{-1}L(E(N_n^{*})) + \theta_2 Ln)^{1/2}$,
$\theta_1 = k/(16c)$ and $\theta_2 > 0$, we have

$$
P(\|S_{n,n}\|_\infty \ge h(n)) \le 2E(N_n^{*})\exp\{-L(E(N_n^{*})) - \theta_1\theta_2 Ln\}.
$$

Since $\theta_1 > 0$, we thus have (2.13) if $\theta_2 > 0$, and (2.14)
follows immediately provided $\theta_1\theta_2 > 1$. Since the above holds
for general $V_{n,j}^{1/2}$, and also the $n^{1/2}$ normalizations,
Theorem 2.2 is proven.
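The cancellation behind this bound can be checked numerically. The Python
sketch below assumes the convention $L(t) = \log\max(t, e)$, which is not
restated in this section and is therefore only an assumption here, and uses
arbitrary illustrative values for $k$, $c$, $\theta_2$, $n$ and
$E(N_n^{*})$. Under these assumptions the right-hand side of (5.9) evaluated
at $x = h(n)$ reduces to $2n^{-\theta_1\theta_2}$.

```python
import math


def L(t):
    """Assumed convention L(t) = log(max(t, e)); not restated in this section."""
    return math.log(max(t, math.e))


def h(n, mean_n_star, theta1, theta2):
    """h(n) = (L(E N*) / theta1 + theta2 * L(n))^(1/2)."""
    return math.sqrt(L(mean_n_star) / theta1 + theta2 * L(n))


def lemma_5_3_bound(x, mean_n_star, k, c):
    """Right-hand side of (5.9): 2 E(N*) exp(-x^2 k / (16 c))."""
    return 2.0 * mean_n_star * math.exp(-x * x * k / (16.0 * c))


k, c, theta2 = 4.0, 1.0, 8.0
theta1 = k / (16.0 * c)          # theta1 * theta2 = 2 > 1
n, mean_n_star = 500, 3571.0     # illustrative values only
bound = lemma_5_3_bound(h(n, mean_n_star, theta1, theta2), mean_n_star, k, c)
```

Since $\theta_1\theta_2 = 2$ in this example, the bound decays like $n^{-2}$
and is therefore summable in $n$, which is the mechanism behind the condition
$\theta_1\theta_2 > 1$ above.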

6.3. Proof of Theorem 2.3. First, observe that for all 2; > that 





n 




(S{ 








i=l 





Letting $b = (b_1, \ldots, b_n)$, where $b_i$ is a positive integer for
$1 \le i \le n$, and setting
$E_{n,b} = \{N_{n,1} = b_1, \ldots, N_{n,n} = b_n\}$, we thus have by
conditioning on $E_{n,b}$ that

$$
P(\|S_{n,n}\|_\infty \ge nx) \le \sum_{(b_1, \ldots, b_n)}\sum_{j=1}^{\max(b_1, \ldots, b_n)} J(n,j,b,x)P(E_{n,b}),
$$

where
$J(n,j,b,x) = P(|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}| \ge nx \mid E_{n,b})$.
Fixing $n$ and $j$, and defining
$X_i = \xi_{n,i,j}R_{n,i,j}I(j \le b_i)$ for $1 \le i \le n$, we see
$X_1, \ldots, X_n$ are independent random variables, and it is easy to check
from our assumptions on $c_{n,j}$ and $k_{n,j}$, and (2.1), that for all
$x \ge 0$

$$
P(|X_i| \ge x) \le ce^{-kx^{r}}.
$$

Therefore, $X_1, \ldots, X_n$ satisfy the conditions in Lemma 5.4 and using
the independence of the sequences $\{\xi_{n,i,j}\}$, $\{R_{n,i,j}\}$, and
$\{N_{n,i}\}$, we have for $x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$ and $s \ge 1$
that (5.14) implies

$$
J(n,j,b,x) \le 4\exp\{-nkx^2/(128cs^2)\} + 4cn\exp\Bigl\{-\frac{ks^{r}}{2^{r}}\Bigr\}.
$$

Combining the previous inequalities in this proof, we have that

$$
P(\|S_{n,n}\|_\infty \ge nx) \le 4E(N_n^{*})[A_n(1) + A_n(2)], \tag{6.5}
$$

where $A_n(1) = \exp\{-nkx^2/(128cs^2)\}$ and
$A_n(2) = cn\exp\{-ks^{r}/2^{r}\}$. Recalling
$h(n) = (c_2^{-1}L(E(N_n^{*})) + c_3L(n))^{1/2}$ and taking
$s = s_n = c_1(L(E(N_n^{*})) + 2L(n))^{1/r}$ and
$x = x_n = s_n(c_2^{-1}L(E(N_n^{*})) + c_3L(n))^{1/2}/n^{1/2}$ in (6.5), then
for all sufficiently large $n$ we have $x \ge \sqrt{8M_{c,k,r}}/\sqrt{n}$,
$s \ge 1$, and (6.5) holds. Thus, (2.18) holds if $c_1 > 0$ and $c_3 > 0$,
and (2.19) follows if we also have $c_2c_3 > 1$. Thus, Theorem 2.3 is proven
when (2.1) holds and $0 < r < 2$.

We now turn to the situation where there is only polynomial decay in the
tails of the data $\xi_{n,i,j}$ as in (2.4), where $c_{n,j} \le c$ and
$k_{n,j} \ge k$ for all $n \ge 1$, $j \ge 1$ and $1 \le c < \infty$,
$2 < k < \infty$. Then the random variables $\xi_{n,i,j}\theta_{n,i,j}$ are
also easily seen to satisfy (2.4), and arguing as in (6.4) and (6.5), and
applying (5.16), we have for $s > 0$ and $x \ge \sqrt{8M_{c,k}}/n^{1/2}$ that

$$
P(\|S_{n,n}\|_\infty \ge nx) \le 4E(N_n^{*})\Bigl[\exp\Bigl\{-\frac{nx^2}{32s^2}\Bigr\} + \frac{2^{k}cn}{(2+s)^{k}}\Bigr]. \tag{6.6}
$$

Taking $s = s_n = (nE(N_n^{*}))^{1/k+\beta}$, $\beta > 0$, and
$x = x_n = bs_n(LE(N_n^{*}))^{1/2}/n^{1/2}$, then for all $n$ sufficiently
large

$$
P(\|S_{n,n}\|_\infty \ge bn^{1/2}s_n(LE(N_n^{*}))^{1/2}) \le 4E(N_n^{*})[A_n(3) + A_n(4)],
$$

where $A_n(3) = \exp\{-(b^2/32)L(E(N_n^{*}))\}$ and
$A_n(4) = 2^{k}cn[2+s_n]^{-k}$. Thus, (2.20) holds when $\beta > 0$ by taking
$b$ large and using the definition of $s_n$. Moreover, (2.21) holds if
$b > 8$ and $k\beta > 1/2$, and (2.22) holds if $b > 8$ and
$\beta k(\gamma+1) > 1$. Thus, Theorem 2.3 is proven.

7. Proof of Theorem 2.4. The proof proceeds with a sequence of lemmas.
Our first lemma provides tightness, and shows $T_{n,n}$ converges in
probability to zero in $c_0$.

Lemma 7.1. Under the conditions (2.29) and (2.30),

$$
\{\mathcal{L}(S_{n,n}) : n \ge 1\} \text{ is tight in } c_0. \tag{7.1}
$$

In addition, if for each $j \ge 1$ we have
$\lim_{n \to \infty} P(N_{n,i} < j) = 0$, then $T_{n,n}$ converges in
probability to zero in $c_0$.

Proof. For general $V_{n,j}$, or if the $V_{n,j}$ are replaced by $n$,
(5.7) implies that

$$
P(\|Q_d(S_{n,n})\|_\infty \ge \delta) \le \sum_{j \ge d+1} 2\exp\{-\delta^2 k_{n,j}/(16c_{n,j})\}. \tag{7.2}
$$

Hence, (2.30) implies for $\delta > 0$ arbitrary that

$$
\lim_{d \to \infty}\sup_{n \ge 1} P(\|Q_d(S_{n,n})\|_\infty \ge \delta) = 0. \tag{7.3}
$$

Now (2.1) easily implies $E(\xi_{n,i,j}^2) \le c_{n,j}/k_{n,j}$, and the
independence of the sequences of random variables involved implies for each
$j \ge 1$ that

$$
P\Bigl(\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge bV_{n,j}^{1/2}\Bigr) \le b^{-2}c_{n,j}/k_{n,j}.
$$

Thus (2.29), (7.3), and an application of the remark on page 49 of [13]
easily combine to prove the tightness in (7.1) for general $V_{n,j}$ and also
when the $V_{n,j}$ are replaced by $n$.

If $T_{n,n}$ is defined as in (1.9), then for each $\varepsilon > 0$ and
$d \ge 1$ we have

$$
P(\|T_{n,n}\|_\infty \ge 2\varepsilon) \le P(\|T_{n,n} - Q_d(T_{n,n})\|_\infty \ge \varepsilon) + P(\|Q_d(T_{n,n})\|_\infty \ge \varepsilon).
$$

Now (7.3) immediately implies there exists $d \ge 1$ such that uniformly in
$n$, $P(\|Q_d(T_{n,n})\|_\infty \ge \varepsilon) \le \varepsilon/2$.
Independence of the sequences of random variables involved then implies for
each $j \ge 1$ and $b > 0$ that

$$
P\Bigl(\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge bV_{n,j}\Bigr) \le b^{-2}E(V_{n,j}^{-1})c_{n,j}/k_{n,j}.
$$

Since $\lim_{n \to \infty} P(N_{n,i} < j) = 0$, the weak law of large numbers
and Chebyshev's inequality applied to the i.i.d. sequence of random variables
$\{\theta_{n,i,j} : i \ge 1\}$ imply for each fixed $j \ge 1$ and $M > 0$
that $\limsup_{n \to \infty} P(V_{n,j} \le M) = 0$. Thus,
$\limsup_{n \to \infty} E(V_{n,j}^{-1}) = 0$, so for each fixed $j \ge 1$ and
$b > 0$ we have

$$
\lim_{n \to \infty} P\Bigl(\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge bV_{n,j}\Bigr) = 0.
$$

Now

$$
P(\|T_{n,n} - Q_d(T_{n,n})\|_\infty \ge \varepsilon) \le \sum_{j=1}^{d} P\Bigl(\Bigl|\sum_{i=1}^{n}\xi_{n,i,j}\theta_{n,i,j}\Bigr| \ge \varepsilon V_{n,j}\Bigr),
$$

and hence from the above we have for each $\varepsilon > 0$ that

$$
\limsup_{n \to \infty} P(\|T_{n,n}\|_\infty \ge 2\varepsilon) \le \varepsilon.
$$

Thus, the lemma is proven. □

Now that we have tightness of $\{\mathcal{L}(S_{n,n}) : n \ge 1\}$ in
$c_0$, the next step of the proof is to show that the finite-dimensional
distributions induced by $\bigcup_{d \ge 1} c_{0,d}^{*}$ are the same for
every limiting measure of $\{\mathcal{L}(S_{n,n}) : n \ge 1\}$. Here
$c_0^{*}$ denotes the continuous linear functionals on $c_0$ and

$$
c_{0,d}^{*} = \{f \in c_0^{*} : f(Q_d(x)) = 0 \text{ for all } x \in c_0\}. \tag{7.4}
$$

We start by showing that the limiting covariance functions
$\Gamma(k, \cdot, \cdot)$ given in (2.26)-(2.28) determine the limiting
variance of $f(S_{n,n})$ for each $d \ge 1$ and $f \in c_{0,d}^{*}$. This
follows from our next lemma.

Lemma 7.2. If (2.26)-(2.28) hold and
$P(\min_{1 \le i \le n} N_{n,i} < d) = o(1/n^2)$ as $n$ tends to infinity,
then for all $d \ge 1$ and $f \in c_{0,d}^{*}$ we have

$$
\lim_{n \to \infty} E(f^2(S_{n,n})) = \sum_{u=1}^{d}\sum_{v=1}^{d}\Gamma(1,u,v)f(e_u)f(e_v). \tag{7.5}
$$

If $V_{n,j}$ is replaced by $n$ in $S_{n,n}$, then (7.5) holds with
$\Gamma(1, \cdot, \cdot)$ replaced by $\Gamma(2, \cdot, \cdot)$ in the
right-hand side term.

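Proof. Since $f \in c_{0,d}^{*}$, we have

$$
E(f^2(S_{n,n})) = \sum_{u=1}^{d}\sum_{v=1}^{d}\Bigl[\sum_{i=1}^{n}\Gamma_{n,i}(u,v)E\Bigl(\frac{\theta_{n,i,u}\theta_{n,i,v}}{V_{n,u}^{1/2}V_{n,v}^{1/2}}\Bigr)\Bigr]f(e_u)f(e_v), \tag{7.6}
$$

where $\Gamma_{n,i}(u,v) = E(\xi_{n,i,u}\xi_{n,i,v})$. This follows
immediately since the sequences $\{N_{n,i}\}$ and $\{R_{n,i,j}\}$ are
independent of the sequence $\{\xi_{n,i,j}\}$, and $E(\xi_{n,i,j}) = 0$ with
the random variables $\xi_{n,i,j}$ independent in $i$. Hence, (7.5) and
Lemma 7.2 follow from (7.6) once we prove the following lemma. The situation
when $V_{n,j}$ is replaced by $n$ in $S_{n,n}$ is simpler, so for the time
being, we assume the $V_{n,j}$ are random. The nonrandom case will be taken
up later. □

Lemma 7.3. Under the assumptions of Theorem 2.4, we have

$$
\lim_{n \to \infty}\sum_{i=1}^{n}\Gamma_{n,i}(u,v)E\Bigl(\frac{\theta_{n,i,u}\theta_{n,i,v}}{V_{n,u}^{1/2}V_{n,v}^{1/2}}\Bigr) = \Gamma(1,u,v). \tag{7.7}
$$

Proof. Since $\lim_{n \to \infty}\sum_{i=1}^{n}\Gamma_{n,i}(u,v)/n = \Gamma(u,v)$
by assumption, (7.7) will follow if we first show for $u \ne v$ that

$$
E\Bigl(\frac{\theta_{n,i,u}\theta_{n,i,v}}{V_{n,u}^{1/2}V_{n,v}^{1/2}}\Bigr) = \frac{p}{n}\,a_{n,i}, \tag{7.8}
$$

where $\lim_{n \to \infty}\sup_{1 \le i \le n}|a_{n,i} - 1| = 0$, and that

$$
\sup_{n \ge 1, i \ge 1}\Gamma_{n,i}(u,v) < \infty. \tag{7.9}
$$

Now

$$
\sup_{n \ge 1, i \ge 1}\Gamma_{n,i}(u,v) = \sup_{n \ge 1, i \ge 1} E(\xi_{n,i,u}\xi_{n,i,v}) \tag{7.10}
$$

and, as mentioned earlier, (2.1) with $r = 2$ implies
$E(\xi_{n,i,j}^2) \le c_{n,j}/k_{n,j}$. Therefore, the Cauchy-Schwarz
inequality and (2.29) easily combine to imply (7.9). Hence, when $u \ne v$
it remains to prove (7.8).

To verify (7.8) for random $V_{n,j}$, for $n \ge 1$, $u \ge 1$, let

$$
A_{n,i} = \frac{\theta_{n,i,u}\theta_{n,i,v}}{V_{n,u}^{1/2}V_{n,v}^{1/2}} \quad\text{and}\quad W_{n,u} = \max\Bigl\{1, \sum_{i=1}^{n} R_{n,i,u}\Bigr\}.
$$

Thus, we have

$$
E(A_{n,i}) = E\Bigl[A_{n,i}I\Bigl(\min_{1 \le i \le n} N_{n,i} \ge d\Bigr)\Bigr] + E\Bigl[A_{n,i}I\Bigl(\min_{1 \le i \le n} N_{n,i} < d\Bigr)\Bigr]
= B_n(1,i) - B_n(2,i) + B_n(3,i), \tag{7.11}
$$

where

$$
B_n(1,i) = E\Bigl(\frac{R_{n,i,u}R_{n,i,v}}{W_{n,u}^{1/2}W_{n,v}^{1/2}}\Bigr), \qquad B_n(3,i) = E\Bigl(A_{n,i}I\Bigl(\min_{1 \le i \le n} N_{n,i} < d\Bigr)\Bigr)
$$

and

$$
B_n(2,i) = E\Bigl(\frac{R_{n,i,u}R_{n,i,v}}{W_{n,u}^{1/2}W_{n,v}^{1/2}}I\Bigl(\min_{1 \le i \le n} N_{n,i} < d\Bigr)\Bigr).
$$

Also,

$$
B_n(3,i) \le P\Bigl(\min_{1 \le i \le n} N_{n,i} < d\Bigr) = o(1/n) \tag{7.12}
$$

and

$$
B_n(2,i) \le P\Bigl(\min_{1 \le i \le n} N_{n,i} < d\Bigr) = o(1/n). \tag{7.13}
$$

Hence, (7.8) will follow if we show for all $i$, $1 \le i \le n$, and
$u \ne v$ that

$$
B_n(1,i) = \frac{p}{n}\,a_n, \tag{7.14}
$$

where $\lim_{n \to \infty} a_n = 1$. To verify (7.14), we establish the
following lemma, which immediately implies (7.14) when $u \ne v$. If
$u = v$, then from the above we see that the analogue of (7.14) required is
that $E(R_{n,i,u}/W_{n,u}) = c_n/n$ where $\lim_{n \to \infty} c_n = 1$. This
follows since from the proof of Lemma 7.4 below we actually have
$c_n = 1 - (1-p)^{n}$. Hence, the proof of Lemma 7.3 and also Lemma 7.2 for
random $V_{n,j}$, will follow once Lemma 7.4 is established. □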

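Lemma 7.4. Let $\{X_i : 1 \le i \le n\}$ and $\{Y_i : 1 \le i \le n\}$ be
independent collections of i.i.d. Bernoulli random variables with
$P(X_i = 1) = P(Y_i = 1) = p$. Let $A_n = \max\{1, \sum_{i=1}^{n} X_i\}$ and
$B_n = \max\{1, \sum_{i=1}^{n} Y_i\}$. Then for all $i$, $1 \le i \le n$, we
have

$$
E\Bigl(\frac{X_iY_i}{A_n^{1/2}B_n^{1/2}}\Bigr) = \frac{p}{n}\,c_n, \tag{7.15}
$$

where $\lim_{n \to \infty} c_n = 1$.

Proof. By the independence assumed, we have

$$
E\Bigl(\frac{X_iY_i}{A_n^{1/2}B_n^{1/2}}\Bigr) = E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr)E\Bigl(\frac{Y_i}{B_n^{1/2}}\Bigr),
$$

and hence since $\{X_i : 1 \le i \le n\}$ and $\{Y_i : 1 \le i \le n\}$ are
i.i.d. Bernoulli random variables with $P(X_i = 1) = P(Y_i = 1) = p$, it
suffices to verify that

$$
E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr) = \Bigl(\frac{p}{n}\Bigr)^{1/2}c_n, \tag{7.16}
$$

where $\lim_{n \to \infty} c_n = 1$. Now, using Jensen's inequality and an
easy calculation, we have

$$
E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr) = pE\Bigl(\Bigl(1 + \sum_{l \ne i} X_l\Bigr)^{-1/2}\Bigr) \le p\Bigl(E\Bigl(\Bigl(1 + \sum_{l \ne i} X_l\Bigr)^{-1}\Bigr)\Bigr)^{1/2} = p\Bigl(\frac{1 - (1-p)^{n}}{np}\Bigr)^{1/2}. \tag{7.17}
$$

Hence, since we assume $0 < p \le 1$, we have

$$
E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr) \le \Bigl(\frac{p}{n}\Bigr)^{1/2}.
$$

Thus, Lemma 7.4 will follow provided we establish a comparable lower bound.

Now for each $\varepsilon$, $0 < \varepsilon < p$, we have

$$
E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr) \ge \frac{p}{C(n,p,\varepsilon)}\sum_{\{k : |k/(n-1) - p| \le \varepsilon/L(n)\}} P\Bigl(\sum_{i=2}^{n} X_i = k\Bigr),
$$

where
$C(n,p,\varepsilon) = (1 + (n-1)p + (n-1)\varepsilon/L(n))^{1/2}$. Since
$\varepsilon > 0$, Theorem 1 of [12] implies

$$
\sum_{\{k : |k/(n-1) - p| \le \varepsilon/L(n)\}} P\Bigl(\sum_{i=2}^{n} X_i = k\Bigr) \ge 1 - 2\exp\Bigl\{-2(n-1)\frac{\varepsilon^2}{(L(n))^2}\Bigr\}
$$

and hence

$$
E\Bigl(\frac{X_i}{A_n^{1/2}}\Bigr) \ge \frac{p}{C(n,p,\varepsilon)}\Bigl(1 - 2\exp\Bigl\{-2(n-1)\frac{\varepsilon^2}{(L(n))^2}\Bigr\}\Bigr) = \Bigl(\frac{p}{n}\Bigr)^{1/2}d_n,
$$

where $\lim_{n \to \infty} d_n = 1$. This implies the comparable lower bound
to (7.17), and therefore Lemma 7.4 holds.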
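The convergence asserted in Lemma 7.4 can be examined exactly for moderate
$n$, because on $\{X_1 = 1\}$ the variable $A_n$ equals one plus a binomial
random variable with parameters $n - 1$ and $p$. The following Python sketch
evaluates $E(X_1/A_n^{1/2})$ by direct summation and normalizes it by
$(p/n)^{1/2}$; the value $p = 0.3$, the sample sizes and the function names
are illustrative assumptions, and the code requires $0 < p < 1$.

```python
import math


def binom_pmf(m, k, p):
    """P(Bin(m, p) = k), computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
               + k * math.log(p) + (m - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def expected_ratio(n, p, power):
    """E(X_1 / A_n^power) for A_n = max(1, X_1 + ... + X_n), X_i iid Bernoulli(p)."""
    # On {X_1 = 1}, A_n = 1 + Bin(n - 1, p); on {X_1 = 0} the ratio vanishes.
    return p * sum(binom_pmf(n - 1, k, p) / (1.0 + k) ** power for k in range(n))


def c_n(n, p):
    """Normalized quantity (n / p)^(1/2) * E(X_1 / A_n^(1/2))."""
    return math.sqrt(n / p) * expected_ratio(n, p, 0.5)


values = {n: c_n(n, 0.3) for n in (10, 100, 2000)}
```

With `power = 1.0` the same function gives $E(X_1/A_n)$, and multiplying by
$n$ reproduces the identity $1 - (1-p)^{n}$ used for the diagonal case
$u = v$ in the proof of Lemma 7.3.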

As mentioned earlier, Lemma 7.4 completes the proof of Lemma 7.3, and
hence Lemma 7.2 is established with (7.5) providing a limiting variance
function when the $V_{n,j}$ are random. If the $V_{n,j}$ are replaced by $n$
in $S_{n,n}$, then the proof of Lemma 7.3 with the right-hand side of (7.7)
being $\Gamma(2,u,v)$ is much simpler, and the details are left for the
reader. Hence, Lemma 7.2 is proven. □

Now that Lemma 7.2 is verified, the next step is to show for all
$d \ge 1$, $f \in c_{0,d}^{*}$, and random $V_{n,j}$ that all limit laws of
$\{\mathcal{L}(f(S_{n,n})) : n \ge 1\}$ are centered Gaussian random
variables with variance given by

$$
\sigma^2(f) = \sum_{u=1}^{d}\sum_{v=1}^{d}\Gamma(1,u,v)f(e_u)f(e_v). \tag{7.18}
$$

Of course, if the $V_{n,j}$ are replaced by $n$ in $S_{n,n}$, then (7.18)
holds with $\Gamma(1,u,v)$ replaced by $\Gamma(2,u,v)$.

To verify this step of the proof, we first prove a lemma which will put us
in position to allow an application of Lyapunov's central limit theorem.

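Lemma 7.5. For each integer $d \ge 1$ and $x \in c_0$, let
$\Pi_d(x) = \sum_{j=1}^{d} x_je_j$. Under the conditions of the theorem, we
have for each $d \ge 1$ that

$$
\lim_{n \to \infty}\sum_{i=1}^{n} E(\|\Pi_d(X_{n,i})\|_\infty^{4}) = 0, \tag{7.19}
$$

where

$$
X_{n,i} = \sum_{j \ge 1}\frac{\xi_{n,i,j}\theta_{n,i,j}}{V_{n,j}^{1/2}}\,e_j.
$$

Proof. By Jensen's inequality, we see that

$$
\|\Pi_d(X_{n,i})\|_\infty^{4} \le \Bigl(\sum_{j=1}^{d}\frac{|\xi_{n,i,j}|\theta_{n,i,j}}{V_{n,j}^{1/2}}\Bigr)^{4} \le 2^{3(d-1)}\sum_{j=1}^{d}\frac{\xi_{n,i,j}^{4}\theta_{n,i,j}}{V_{n,j}^{2}}.
$$

Hence,

$$
E(\|\Pi_d(X_{n,i})\|_\infty^{4}) \le 2^{3(d-1)}\sum_{j=1}^{d} E\Bigl(\frac{\xi_{n,i,j}^{4}\theta_{n,i,j}}{V_{n,j}^{2}}\Bigr)
$$

and the lemma will follow if we show

$$
\lim_{n \to \infty}\sum_{i=1}^{n} E\Bigl(\frac{\xi_{n,i,j}^{4}\theta_{n,i,j}}{V_{n,j}^{2}}\Bigr) = 0 \tag{7.20}
$$

for $j = 1, \ldots, d$ and all $d \ge 1$. Now
$E(\xi_{n,i,j}^{4}\theta_{n,i,j}/V_{n,j}^{2}) \le (E(\xi_{n,i,j}^{8}))^{1/2}(E(\theta_{n,i,j}/V_{n,j}^{4}))^{1/2}$,
and using (2.1) with $r = 2$ we have
$E(\xi_{n,i,j}^{8}) \le 24c_{n,j}/k_{n,j}^{4}$. Applying (2.29) and that
$c_{n,j} \ge 1$ for all $n \ge 1$, $j \ge 1$ we therefore have, uniformly in
$n$ and $j$, that $E(\xi_{n,i,j}^{8}) < \infty$. Moreover, since
$P(\min_{1 \le i \le n} N_{n,i} < d) = o(1/n^2)$, one can show that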



E 



-pE 



1 



+ o(l/n2). 



Therefore, let 



^"=E(m)4^(2] 

fc=0 ^ ^ \i=l 



and we want an appropriate upper bound on An-i- Now An < i?n + 2exp{— 2n/ 
(L(n))2}, where 



{fc: |fc/n-p|<l/L(n)} ^ ^ \i=l / 



and the exponential term follows from an immediate application of Theorem 
1 in [12]. Now 



Br,< 



< 



" - (np)4[l + l/{np) - l/(pL(n))]4 " (np)4 
for all n sufficiently large. Therefore, for all n > no, we have 



An-l < 



((n - l)p)4 



+ 2exp 



2(n- 1) 
(^(n-l))^]' 
which implies

uniformly in $i \ge 1$, $j \ge 1$. Thus, uniformly in $j \ge 1$, we have

which implies (7.20). Thus, (7.19) holds by the inequality prior to (7.20),
and Lemma 7.5 is proven for random $V_{n,j}$. If the $V_{n,j}$ are replaced
by $n$ in $S_{n,n}$, then the proof is even easier and details are left to
the reader. Hence, Lemma 7.5 holds for both normalizations. □



The next lemma completes the proof of Theorem 2.4.

Lemma 7.6. The functions $\Gamma(1, \cdot, \cdot)$ and
$\Gamma(2, \cdot, \cdot)$ defined by (2.26)-(2.28), are covariances of
centered Gaussian measures $\gamma_1$ and $\gamma_2$, respectively, on $c_0$.
Furthermore, if the $V_{n,j}$ are random, then $S_{n,n}$ converges weakly to
$\gamma_1$ on $c_0$, and if the $V_{n,j}$ are replaced by $n$, then $S_{n,n}$
converges weakly to $\gamma_2$ on $c_0$. In addition, for each
$f \in c_0^{*}$ and $k = 1, 2$, we have

$$
\int_{c_0} f^2(x)\,d\gamma_k(x) = \sum_{u=1}^{\infty}\sum_{v=1}^{\infty}\Gamma(k,u,v)f(e_u)f(e_v).
$$

Proof. First, assume the $V_{n,j}$ are random. Then since (7.19) is
verified, we also see for all $d \ge 1$ and $f \in c_{0,d}^{*}$ that

$$
\lim_{n \to \infty}\sum_{i=1}^{n} E(f^{4}(X_{n,i})) = 0.
$$

Hence, by (7.5) and Lyapunov's central limit theorem, we have that
$f(S_{n,n})$ converges in distribution to a mean zero Gaussian random
variable with variance given by (7.18) for all $d \ge 1$ and
$f \in c_{0,d}^{*}$. If $\mu$ is a probability measure on the Borel subsets
of $c_0$, and for all $k \ge 1$, $d \ge 1$,
$f_1, \ldots, f_k \in c_{0,d}^{*}$, and $A$ is an arbitrary Borel set of
$\mathbb{R}^{k}$, then the probability distributions

$$
F_{f_1, \ldots, f_k}(A) = \mu(\{x \in c_0 : (f_1(x), \ldots, f_k(x)) \in A\})
$$

are the finite-dimensional distributions of $\mu$ on $c_0$ induced by
$\bigcup_{d \ge 1} c_{0,d}^{*}$, and they uniquely determine $\mu$ on the
Borel subsets of $c_0$. In view of the tightness obtained in Lemma 7.1, we
thus have that $S_{n,n}$ converges weakly to a unique probability on the
Borel subsets of $c_0$, which for the moment we call $\mu$. What remains is
to show that for every $f \in c_0^{*}$ this limiting measure makes $f$ a
centered Gaussian random variable with variance determined by
$\Gamma(1, \cdot, \cdot)$. Recalling that pointwise limits of centered
Gaussian random variables are again centered Gaussian variables with limiting
variances the limits of the variances, and that
$\bigcup_{d \ge 1} c_{0,d}^{*}$ is weak star dense in $c_0^{*}$, it follows
that $\mu$ is a centered Gaussian measure on $c_0$. Furthermore, if
$f \in c_0^{*}$ and $f_d(x) = f(\Pi_d(x))$, $x \in c_0$, then for random
$V_{n,j}$ we have

$$
\int_{c_0} f^2(x)\,d\mu(x) = \lim_{d \to \infty}\int_{c_0} f_d^2(x)\,d\mu(x) = \lim_{d \to \infty}\sum_{u=1}^{d}\sum_{v=1}^{d}\Gamma(1,u,v)f(e_u)f(e_v).
$$

Since $\sup_{i \ge 1} E(\xi_{n,i,j}^2) \le c_{n,j}/k_{n,j}$, we have from
(2.29), (2.26)-(2.28), and Cauchy-Schwarz that

$$
\sup_{j_1, j_2 \ge 1}|\Gamma(1,j_1,j_2)| < \infty.
$$

Now $c_0^{*} = \ell_1$, and hence the dominated convergence theorem easily
implies $\mu$ is a centered Gaussian measure on $c_0$ with covariance given
by $\Gamma(1, \cdot, \cdot)$. Moreover, for each $f \in c_0^{*}$ we have

$$
\int_{c_0} f^2(x)\,d\mu(x) = \sum_{u=1}^{\infty}\sum_{v=1}^{\infty}\Gamma(1,u,v)f(e_u)f(e_v).
$$

Hence, when $V_{n,j}$ is random, the centered Gaussian measure $\gamma_1$
exists as indicated, that is, its covariance is $\Gamma(1, \cdot, \cdot)$,
and $\mu = \gamma_1$. Similarly, when the $V_{n,j}$ are replaced by $n$, then
$\gamma_2$ exists as indicated, and $\mu = \gamma_2$. This last fact is easy
to check by immediate simplifications of what we have done when $V_{n,j}$ is
random, and the details are left to the reader. Hence for each choice of
normalizers, there is a unique limiting Gaussian measure, and its
finite-dimensional distributions are centered Gaussian measures determined
by the appropriate covariance function. Therefore, the lemma is proved, and
Theorem 2.4 is established. □